PREFACE


In the past few years there has been a great resurgence of interest in 
mathematics on both the secondary and undergraduate levels, and a 
growing recognition that the courses traditionally offered do not exhaust 
the mathematics which it is both possible and desirable to teach at those 
levels. Of course, not all of modern mathematics is accessible; some of it 
is too abstract to be comprehensible without more training in mathemati- 
cal thinking, and some of it requires more technical knowledge than the 
young student can have mastered. Happily, the theory of numbers pre- 
sents neither of these difficulties. The subject matter is the very concrete 
set of whole numbers, the rules are those the student has been accustomed 
to since grade school, and no assumption need be made as to special prior 
knowledge. To be sure, the results are not directly applicable in the 
physical world, but it is difficult to name a branch of mathematics in 
which the student encounters greater variety in types of proofs, or in 
which he will find more simple problems to stimulate his interest, chal- 
lenge his ability, and increase his mathematical strength. For these and 
a number of other reasons, both the School Mathematics Study Group 
and the Committee on Undergraduate Programs have advocated the 
teaching of number theory to high-school and college students. 

The present book is the result of an attempt to expose the subject 
in such form as to be accessible to persons with less mathematical training 
than those who would normally read, for example, the author’s Topics in 
Number Theory, Volume I. There is considerable overlapping in material, 
of course—it is, after all, the same subject—but the exposition is more 
leisurely, the examples and computational problems are more numerous, 
and certain relatively difficult topics have been omitted. Furthermore, 
the chapter on Gaussian arithmetic is entirely new, and the chapters on 
continued fractions and Diophantine equations have been almost entirely 
rewritten. I hope that the book may prove useful in high-school enrich- 
ment programs, in nontraditional freshman and sophomore courses, and 
in teacher training and refresher programs. 

Certain problems are starred to indicate greater than average difficulty. 


W. J. L.
CHAPTER 1
INTRODUCTION 


1-1 What is number theory? In number theory we are concerned with 
properties of certain of the integers (whole numbers) 


        ..., -3, -2, -1, 0, 1, 2, 3, ...,


or sometimes with those properties of real or complex numbers which 
depend rather directly on the integers. It might be thought that there is 
little more that can be said about such simple mathematical objects than 
what has already been said in elementary arithmetic, but if you stop to 
think for a moment, you will realize that heretofore integers have not been 
considered as interesting objects in their own right, but simply as useful 
carriers of information. After totaling a grocery bill, you are interested 
in the amount of money involved, and not in the number representing 
that amount of money. In considering sin 31°, you think either of an 
angular opening of a certain size, and the ratios of some lengths related to 
that angle, or of a certain position in a table of trigonometric functions, 
but not of any interesting properties that the number 31 might possess. 

The attitude which will govern the treatment of integers in this text is 
perhaps best exemplified by a story told by G. H. Hardy, an eminent 
British number theorist who died in 1947. Hardy had a young protégé, 
an Indian named Srinivasa Ramanujan, who had such a truly remarkable 
insight into hidden arithmetical relationships that, although he was almost 
uneducated mathematically, he did a great amount of first-rate original 
research in mathematics. Ramanujan was ill in a hospital in England, 
and Hardy went to visit him. When he arrived, he idly remarked that the 
taxi in which he had ridden had the license number 1729, which, he said, 
seemed to him a rather uninteresting number. Ramanujan immediately 
replied that, on the contrary, 1729 was singularly interesting, being the 
smallest positive integer expressible as a sum of two positive cubes in two 
different ways, namely 1729 = 10^3 + 9^3 = 12^3 + 1^3.
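
The search behind Ramanujan's remark is easy to mechanize. The following Python sketch is our own illustration and not part of the original text; the function name and the search limit of 2000 are arbitrary choices. It tabulates every sum a^3 + b^3 with 1 <= a <= b and returns the smallest sum that arises from at least two different pairs.

```python
from collections import defaultdict


def smallest_two_way_cube_sum(limit):
    """Return (n, pairs) for the smallest n <= limit that is a sum of two
    positive cubes in at least two ways, or None if there is no such n."""
    ways = defaultdict(list)
    a = 1
    while 2 * a**3 <= limit:            # the smallest sum using a is a^3 + a^3
        b = a
        while a**3 + b**3 <= limit:
            ways[a**3 + b**3].append((a, b))
            b += 1
        a += 1
    candidates = [n for n, pairs in ways.items() if len(pairs) >= 2]
    if not candidates:
        return None
    n = min(candidates)
    return n, ways[n]


print(smallest_two_way_cube_sum(2000))   # prints (1729, [(1, 12), (9, 10)])
```

A table of this kind settles one particular integer; it says nothing by itself about what happens beyond the chosen limit.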

It should not be inferred that one needs to know all such little facts 
to understand number theory, or that one needs to be a lightning cal- 
culator; we simply wished to make the point that the question of what the 
smallest integer is which can be represented as a sum of cubes in two ways 
is of interest to a number theorist. It is interesting not so much for its 
own sake (after all, anyone could find the answer after a few minutes of 
unimaginative computation), but because it raises all sorts of further 
questions whose answers are by no means simple matters of calculation.
For example, if s is any positive integer, about how large is the smallest 
integer representable as a sum of cubes of positive integers in s different 
ways? Or, are there infinitely many integers representable as a sum of 
cubes in two different ways? Or, how can one characterize in a different 
fashion the integers which can be represented as a sum of two cubes in at 
least one way? Or, are any cubes representable as a sum of two cubes? 
That is, has the equation 


        x^3 + y^3 = z^3                (1)


any solutions in positive integers x, y, and z? These questions, like that
discussed by Hardy and Ramanujan, are concerned with integers, but 
they also have an additional element which somehow makes them more 
significant: they are concerned not with a particular integer, but with 
whole classes or collections of integers. It is this feature of generality, 
perhaps, which distinguishes the theory of numbers from simple arithmetic. 
Still, there is a gradual shading from one into the other, and number theory 
is, appropriately enough, sometimes called higher arithmetic. 

In view of the apparent simplicity of the subject matter, it is not sur- 
prising that number-theoretic questions have been considered throughout 
almost the entire history of recorded mathematics. One of the earliest 
such problems must have been that of solving the “Pythagorean” equation 


        x^2 + y^2 = z^2.                (2)


For centuries it was supposed that the classical theorem embodied in (2) 
concerning the sides of a right triangle was due either to Pythagoras or a 
member of his school (about 550 B.C.). Recently interpreted cuneiform
texts give strong evidence, however, that Babylonian mathematicians
not only knew the theorem as early as 1600 B.C., but that they knew how
to compute all integral solutions x, y, z of (2), and used this knowledge for
the construction of crude trigonometric tables. There is no difficulty in 
finding a large number of integral solutions of (2) by trial and error—just 
add many different pairs of squares, and some of the sums will turn out 
to be squares also. Finding all solutions is another matter, requiring
understanding rather than patience. We shall treat this question in detail
in Chapter 5. 
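
The trial-and-error search just described can be sketched in Python as follows. This is our own illustration: the function name and the bound on x and y are assumptions of the example. It adds pairs of squares and keeps those whose sum is again a square, which finds many solutions of (2) but gives no insight into their general form.

```python
from math import isqrt


def pythagorean_solutions(bound):
    """List all (x, y, z) with 1 <= x <= y <= bound and x^2 + y^2 = z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            s = x * x + y * y
            z = isqrt(s)                 # exact integer square root
            if z * z == s:               # the sum of the two squares is a square
                found.append((x, y, z))
    return found


for triple in pythagorean_solutions(20):
    print(triple)
```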

Whatever the Babylonians may have known and understood, it seems 
clear that we are indebted to the Greeks for their conception of mathe- 
matics as a systematic theory founded on axioms or unproved assumptions, 
developed by logical deduction and supported by strict proofs. It would 
probably not have occurred to the Babylonians to write out a detailed 
analysis of the integral solutions of (2), as Euclid did in the tenth book of 
his Elements. This contribution by Euclid was minor, however, compared 
with his invention of what is now called the Euclidean algorithm, which
we shall consider in the next chapter. Almost equally interesting was his 
proof that there are infinitely many prime numbers, a prime number being 
an integer such as 2, 3, 5, etc., which has no exact divisors except itself, 1, 
and the negatives of these two numbers.* We shall repeat this proof later 
in the present chapter. 

Another Greek mathematician whose work remains significant in present-day
number theory is Diophantos, who lived in Alexandria, about 250 A.D.
Many of his writings have been lost, but they all seem to have been con- 
cerned with the solution in integers (or sometimes in rational numbers) 
of various algebraic equations. In his honor we still refer to such equations 
as (1) and (2) above as Diophantine equations, not because they are spe- 
cial kinds of equations, but because special kinds of solutions are required. 
Diophantos considered a large number of such equations, and his work was 
continued by the Arabian Al-Karkhi (ca. 1030) and the Italian Leonardo 
Pisano (ca. 1200). Although it is possible that these latter works were 
known to Pierre Fermat (1601-1665), the founding father of number 
theory as a systematic branch of knowledge, it is certain that Fermat’s 
principal inspiration came directly from Diophantos’ works. 

The questions considered in the theory of numbers can be grouped ac- 
cording to a more or less rough classification, as will now be explained. It 
should not be inferred that every problem falls neatly into one of these 
classes, but simply that many questions of each of the following categories 
have been considered. 

First, there are multiplicative problems, concerned with the divisibility 
properties of integers. It will be proved later that any positive integer n 
greater than 1 can be represented uniquely, except for the order of the 
factors, as a product of one or more positive primes. For example, 


        13 = 13,        2,892,384 = 2^5 · 3^2 · 11^2 · 83,


and there is no essentially different factorization of these integers, if the 
factors are required to be primes. This unique factorization theorem, as 
it is called, might almost be termed the fundamental theorem of number 
theory, so manifold and varied are its applications. From the decomposi- 
tion of n into primes, it is easy to determine the number of positive di- 
visors (i.e., exact divisors) of n. This number, which of course depends on 
n, is called τ(n) by some writers and d(n) by others; we shall use the former
designation (τ is the Greek letter tau; see the Greek alphabet in the
appendix).


*The term prime will usually be reserved for the positive integers with this
property; the numbers -2, -3, -5, etc., will be called negative primes. Note that
1 is not included among the primes.


The behavior of τ(n) is very erratic, as we can see by examining
Table 1-1. If n = 2^m, the divisors of n are 1, 2, 2^2, ..., 2^m, so that
τ(2^m) = m + 1. On the other hand, if n is a prime, then τ(n) = 2.
Since, as we shall see, there are infinitely many primes, it appears that the
τ-function has arbitrarily large values, and yet has the value 2 for
infinitely many n. A number of questions might occur to anyone who thinks
about the subject for a few moments and studies the above table. For 
example: 


(a) Is it true that τ(n) is odd if and only if n is a square?

(b) Is it always true that if m and n have no common factor larger than
1, then τ(m)τ(n) = τ(mn)?

(c) How large can τ(n) be in comparison with n? From the equation

        τ(2^m) = m + 1 = (log 2^m)/(log 2) + 1,

it might be guessed that perhaps there is a constant c such that

        τ(n) < c log n                (3)

for all n. If this is false, is there any better upper bound than the
trivial one, τ(n) ≤ n? (The last inequality is a consequence of the
fact that only the n integers 1, 2, ..., n could possibly divide n.)

(d) How large is τ(n) on the average? That is, what can be said about
the quantity

        (1/N) (τ(1) + τ(2) + ... + τ(N))                (4)

as N increases indefinitely?




(e) For large N, approximately how many solutions n ≤ N are there
of the equation τ(n) = 2? In other words, about how many primes
are there among the integers 1, 2, ..., N?
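
Questions (a) and (b) can at least be tested numerically before a proof is attempted. The Python sketch below is our own illustration; the function name tau and the ranges checked are assumptions of the example. It counts divisors by trial division, so it offers evidence over a finite range and nothing more.

```python
from math import gcd, isqrt


def tau(n):
    """Number of positive divisors of n, by trial division up to sqrt(n)."""
    count = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            # d and n // d are both divisors; they coincide when d * d == n
            count += 1 if d * d == n else 2
    return count


# Question (a): tau(n) is odd exactly when n is a perfect square (n <= 200).
odd_iff_square = all((tau(n) % 2 == 1) == (isqrt(n) ** 2 == n)
                     for n in range(1, 201))

# Question (b): tau(m) * tau(n) == tau(m * n) when m and n have no common
# factor larger than 1 (m, n <= 40).
multiplicative = all(tau(m) * tau(n) == tau(m * n)
                     for m in range(1, 41)
                     for n in range(1, 41)
                     if gcd(m, n) == 1)

print([tau(n) for n in range(1, 13)])
print(odd_iff_square, multiplicative)
```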


Of these questions, which are fairly typical problems in multiplicative 
number theory, the first two are very easy to answer in the affirmative. 
The third and fourth are somewhat more difficult, and we shall not con-
sider them further in this book. However, to satisfy the reader’s curiosity,
we mention that (3) is false for certain sufficiently large n, however large
the constant c may be, whereas the inequality τ(n) < c n^ε is true for all
sufficiently large n, however small the positive constants c and ε may be.
On the other hand, for large N, the average value (4) is nearly equal*
to log N. The last is very difficult indeed. It was conjectured independ-
ently by C. F. Gauss and A. Legendre, in about 1800, that the number,
commonly called π(N), of primes not exceeding N is approximately
N/log N, in the sense that the relative error


        |π(N) - N/log N| / (N/log N) = |π(N)/(N/log N) - 1|


is very small when N is sufficiently large. Many years later (1852-54), 
P. L. Chebyshev showed that if this relative error has any limiting value, 
it must be zero, but it was not until 1896 that J. Hadamard and C. de la 
Vallée Poussin finally proved what is now called the prime number 
theorem, that 


        lim_{N→∞} π(N)/(N/log N) = 1.


In 1948 a much more elementary proof of this theorem was discovered by 
the Norwegian mathematician Atle Selberg and the Hungarian mathematician
Paul Erdős; even this proof, however, is too difficult for inclusion
here. 
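
Some numerical feeling for the prime number theorem can be had from the Python sketch below. It is our own illustration: the function name prime_count and the use of the sieve of Eratosthenes are assumptions of the example and are not taken from the text. It counts the primes not exceeding N and compares the count with N/log N, using the natural logarithm.

```python
from math import log


def prime_count(N):
    """Number of primes not exceeding N, via the sieve of Eratosthenes."""
    if N < 2:
        return 0
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= N:
        if is_prime[p]:
            # cross out the multiples of p, starting at p * p
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False
        p += 1
    return sum(is_prime)


for N in (10**2, 10**3, 10**4, 10**5):
    count = prime_count(N)
    print(N, count, round(count / (N / log(N)), 3))
```

In this range the printed ratio stays noticeably above 1 and decreases only slowly, which is consistent with the theorem but is of course no substitute for a proof.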

Secondly, there are the problems of additive number theory: questions 
concerning the representability, and number of representations, of a posi- 
tive integer as a sum of integers of a specified kind. For example, certain 
integers such as 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2 are representable as a
sum of two squares, and some, such as 65 = 1^2 + 8^2 = 4^2 + 7^2, have
two such representations, while others, such as 6, have none. Which
integers are so representable, and how many representations are there? 
If we use four squares instead of two, we obtain Table 1-2. 


*In the sense that the relative error is small. Here and elsewhere the log- 
arithm is the so-called natural logarithm, which is a certain constant 2.303... 
times the logarithm to the base ten. 
TABLE 1-2

 1 = 1^2 + 0^2 + 0^2 + 0^2        11 = 3^2 + 1^2 + 1^2 + 0^2
 2 = 1^2 + 1^2 + 0^2 + 0^2        12 = 2^2 + 2^2 + 2^2 + 0^2
 3 = 1^2 + 1^2 + 1^2 + 0^2        13 = 3^2 + 2^2 + 0^2 + 0^2
 4 = 2^2 + 0^2 + 0^2 + 0^2        14 = 3^2 + 2^2 + 1^2 + 0^2
 5 = 2^2 + 1^2 + 0^2 + 0^2        15 = 3^2 + 2^2 + 1^2 + 1^2
 6 = 2^2 + 1^2 + 1^2 + 0^2        16 = 4^2 + 0^2 + 0^2 + 0^2
 7 = 2^2 + 1^2 + 1^2 + 1^2        17 = 4^2 + 1^2 + 0^2 + 0^2
 8 = 2^2 + 2^2 + 0^2 + 0^2        18 = 3^2 + 3^2 + 0^2 + 0^2
 9 = 3^2 + 0^2 + 0^2 + 0^2        19 = 3^2 + 3^2 + 1^2 + 0^2
10 = 3^2 + 1^2 + 0^2 + 0^2        20 = 4^2 + 2^2 + 0^2 + 0^2


From this or a more extensive table, it is reasonable to guess that every 
positive integer is representable as a sum of four squares of nonnegative 
integers. This is indeed a correct guess, which seems already to have 
been made by Diophantos. A proof was known to Pierre Fermat in 1636, 
but the first published proof was given by Joseph Louis Lagrange in 1770. 
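
A table such as Table 1-2 can be extended mechanically. The Python sketch below is our own illustration; the function name and the exhaustive search strategy are assumptions of the example. It looks for a representation of n as a sum of four squares of nonnegative integers and may return a different representation from the one tabulated, since representations are not unique.

```python
from math import isqrt


def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and
    a^2 + b^2 + c^2 + d^2 == n, or None if the search finds nothing."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest and d <= c:   # rest is itself a square
                    return (a, b, c, d)
    return None


for n in range(1, 21):
    print(n, four_squares(n))
```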

More generally, it was proved by David Hilbert in 1909 that if we 
consider kth powers rather than squares, a certain fixed number of them
suffices for the representation of any positive integer. 

There are also some very interesting questions about sums of primes. 
It was conjectured by Christian Goldbach in 1742 that every even integer
larger than 4 is the sum of two odd primes. (All primes except 2 are odd, 
of course, since evenness means divisibility by 2.) Despite enormous efforts 
in the 200 intervening years by many excellent mathematicians, the 
truth or falsity of Goldbach’s conjecture has not been settled to this day. 
It is known, however, that every odd integer larger than 10259:° is the 
sum of three odd primes, which implies that every even integer larger than 
this same number is the sum of four primes. It has also been conjectured, 
so far without proof, that every even integer is representable in infinitely 
many ways as the difference of two primes. In particular, this would 
mean that there are infinitely many prime twins, such as 17 and 19, or 
101 and 103, which differ by 2. 
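
Goldbach's conjecture is easily tested for small even integers, although no amount of such testing can settle it. The Python sketch below is our own illustration; the function names and the range checked are assumptions of the example. For an even integer n larger than 4 it searches for odd primes p <= q with p + q = n.

```python
def is_prime(n):
    """Primality by trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pair(n):
    """For even n > 4, return odd primes (p, q) with p <= q and p + q == n,
    or None if the search finds no such pair."""
    for p in range(3, n // 2 + 1, 2):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(all(goldbach_pair(n) is not None for n in range(6, 2001, 2)))
```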

As a third class of problems, there are the Diophantine equations men- 
tioned earlier. Here the general theory is rather scanty, since the subject 
is intrinsically very difficult. In Chapter 3 we shall give a complete 
analysis of the linear equation in two unknowns, ax + by = c; that is,
we shall determine the exact conditions which a, b, and c must satisfy in
order that the equation be solvable in integers, and we shall present an
effective procedure for finding these solutions. Certain quadratic equations,
such as the Pythagorean equation x^2 + y^2 = z^2 and the so-called Pell




equation, x^2 - dy^2 = 1, can also be solved completely, but relatively
little is known about higher-degree equations in general, although certain
specific equations have been solved. For example, there is the conjecture
due to Fermat, that the equation x^n + y^n = z^n has no solutions in non-
zero integers x, y, and z if n > 2. This is perhaps the oldest and best-
known unsolved problem in mathematics. The conjecture is now known
to be correct for all n < 4000, and it is also known that if n is a prime
smaller than 253,747,889, then there is no solution in which none of x, y,
or z is divisible by n; but the general proposition remains out of reach.

There are many other branches of number theory—Diophantine ap- 
proximation, geometry of numbers, theory of quadratic forms, and ana- 
lytic theory of numbers, to name a few—but their descriptions are more 
complicated, and since we shall not consider problems from such fields in 
this book, we shall not enter into details. In any case, no classification 
can be exhaustive, and perhaps enough examples have been given to show 
the typical flavor of number-theoretic questions. 

Granting that the reader now knows what number theory is, or that he 
will after reading this book, there is still the question of why anyone should 
create or study the subject. Certainly not because of its applicability to 
problems concerning the physical world; such applications are extremely 
rare. The theory of numbers has, on the other hand, been a strong in- 
fluence in the development of higher pure mathematics, both in stimulating 
the creation of powerful general methods in the course of solving special 
problems (such as the Fermat conjecture above, and the prime number 
theorem) and as a source of ideas and inspiration comparable to geometry 
and the mathematics of physical phenomena; and so in retrospect it turns 
out to have been worth developing. But these were not the reasons that 
led men to ponder arithmetical questions, in former times, nor are they the 
reasons for the present day interest in the theory of numbers. The driving 
force is rather man’s insatiable curiosity—the drive to know and do 
everything. In the case at hand this curiosity is whetted considerably by 
the surprising difficulty of the subject, maintained by its tremendous 
diversity, and rewarded by the elegance and unexpectedness of the re- 
sults. It is these attributes, perhaps, which led Carl Friedrich Gauss 
(1777-1855), one of the two or three greatest mathematicians who ever 
lived, to label the theory of numbers the Queen of mathematics.


1-2 Foundations. In the remaining chapters of this book we shall 
adopt the attitude that the integers and the basic arithmetical operations 
by means of which they are combined have already been comprehended 
by the reader, and we shall not dwell on such questions as what the 
integers are, or why 2 + 2 = 4 and 2 + 3 = 3 + 2. Detailed logical
developments of these topics exist (see, for example, E. Landau’s Foundations
of Analysis, Chelsea Publ. Co., New York, 1951), and anyone seriously
interested in mathematics should examine a book on this subject
at some time, to see what is really behind the arithmetic and elementary 
algebra he learned in grade school. That is not our objective here, how- 
ever, and in this section and the next two we shall simply single out a few 
matters which may be genuinely new to the student, and review the re- 
mainder very quickly. 

The arithmetic of the integers, like the geometry of the plane, can be 
made to depend on a few axioms, in the sense that everything else follows 
from them by accepted logical rules. One such set of axioms was given 
by G. Peano in 1889; it characterizes the set (class, collection) of natural 
numbers 1, 2, 3,..., and consists of the following postulates: 


(1) 1 is a natural number.

(2) To each natural number x there corresponds a second natural num-
ber x', called the successor of x.

(3) 1 is not the successor of any natural number.

(4) From x' = y' follows x = y.

(5) Let M be a set of natural numbers with the following two properties:
(a) 1 belongs to M.
(b) If x belongs to M, then x' also belongs to M.


Then M contains all natural numbers.


In the language of these axioms, addition is defined by setting 
x + 1 = x', x + 2 = (x')', etc., and multiplication is defined in terms of
addition: ab = a + a + ... + a, where there are b terms on the right.
The usual rules of algebra can then be deduced, as they apply to the 
natural numbers, and the inequality symbol “<” can be introduced. 
Finally, zero and the negative integers are defined in terms of the natural 
numbers by various devices. 
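
The way in which addition and multiplication grow out of the successor operation can be imitated in Python. The sketch below is our own illustration and makes two modelling assumptions: natural numbers are represented by Python integers n >= 1, and the predecessor b - 1 is borrowed from ordinary arithmetic to recover the number whose successor is b. The recursive rules a + b' = (a + b)' and a · b' = a · b + a are one standard way of spelling out the "etc." in the definitions.

```python
def successor(x):
    """The successor x' of a natural number x (a Python int >= 1)."""
    return x + 1


def add(a, b):
    """a + b, defined by a + 1 = a' and a + b' = (a + b)'."""
    if b == 1:
        return successor(a)
    return successor(add(a, b - 1))    # b - 1 is the number whose successor is b


def mul(a, b):
    """a * b as a sum of b terms equal to a: a * 1 = a, a * b' = a * b + a."""
    if b == 1:
        return a
    return add(mul(a, b - 1), a)


print(add(3, 4), mul(3, 4))   # prints 7 12
```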

This path is rather long, and it might be worth while also to list a second 
set of axioms, which become theorems if one starts from the Peano postu- 
lates, and which are more numerous and complicated than the latter, but 
which relate more directly to the workaday world of algebra and arith- 
metic and, in addition, suffice to deduce all further rules in these subjects. 


I. Each pair of integers a and b has a unique sum a + b and a unique
product a · b or ab, such that the following associative, commuta-
tive, and distributive laws hold:


                     Addition                        Multiplication
Associative law:     a + (b + c) = (a + b) + c       a(bc) = (ab)c
Commutative law:     a + b = b + a                   ab = ba
Distributive law:    a(b + c) = ab + ac

II. The distinct integers 0 and 1 have the properties that a + 0 = a
and 1 · a = a for all integers a.

III. For each integer a the equation a + x = 0 has a unique solution x,
which we call -a.

IV. If c ≠ 0 and ca = cb, then a = b.

V. There is a subset of integers, called positive integers, with the
following properties: the sum and product of positive integers are
positive, and every nonzero integer a has the property that exactly
one of the two numbers a and -a is positive.


The notion of inequality can be introduced in terms of the positive inte-
gers: we say that a is smaller than b, and write a < b, if b - a is a posi-
tive integer, and we write a ≤ b if either a < b or a = b. It can then
be proved that if a < b, then a + c < b + c for all integers c, and if
a < b and 0 < c, then ac < bc. The absolute value |a| of an integer a is
0 if a = 0; otherwise it is the positive one of the two numbers a, -a.


VI. Every set of positive integers which contains at least one member
contains a smallest member. That is, there is an integer a in the
set such that a ≤ b for every b in the set.


Presumably the first five axioms of this second set are already familiar 
to the reader as working rules, although perhaps not in the explicit form 
presented here. The sixth could hardly be surprising, as a “fact” about 
the integers, but it may be surprising how useful it is as a device for 
proving theorems. We shall devote the next section to that topic, but we 
give a first example of it now by showing how the last Peano postulate 
follows from the above six axioms. 

First we prove that there is no integer between 0 and 1. For if there is at
least one such integer, then the set, call it A, of integers a such that
0 < a < 1, has at least one member. (More briefly, we say that A is not
empty.) By Axiom VI, A has a smallest element, say b. But by multiply-
ing through in the inequality 0 < b < 1 by the positive number b we
obtain 0 < b^2 < b, so that b^2 is also an element of A, and is smaller
than b, contrary to the definition of b. Hence A cannot contain a smallest
element, and so must be empty. ▲*

We can now deduce the fifth Peano postulate from Axiom VI above. 
Suppose that M is a set of integers having the two properties described 
in Postulate 5, and let S be the set of all positive integers not in M,
the so-called complementary set to M. It suffices to show that S is empty,
to see that M contains all positive integers. Suppose that S contains at
least one element; then it contains a smallest element, say d. But d ≠ 1,
by the first property in Postulate 5, and since there is no positive integer 


* This symbol signals the end of a proof. 
between 0 and 1, we have d > 1. But then d - 1 is positive, and since
d is the smallest positive integer in S, the smaller number d - 1 must be
in M. But by the second property in Postulate 5, (d - 1) + 1 = d must
also be in M, which is false. Since we have arrived at a contradiction, our
assumptions must have been inconsistent with one another, so that if M
is in fact a set having the properties listed in Postulate 5, then its com-
plementary set must be empty, and M must contain all positive integers. ▲

It is customary to call Postulate 5 the induction axiom, and Axiom VI 
the well-ordering axiom. (A set of numbers is said to be well ordered if 
every subset has a smallest element.) We have just seen that the second 
implies the first (if Axioms I through V are assumed), and conversely,
it is possible to show that the first implies the second. They are therefore 
different versions of the same principle, and can be used interchangeably, 
as we shall now see. 


1-3 Proofs by induction. On many occasions, both in this book and in 
the student’s later mathematical work, a theorem must be proved which is 
of the form, “For every positive integer n, the sentence P(n) is true.” 
Here we have used P(n) as a name for some sentence or other which in- 
volves an integer-valued variable n. For example, P(n) might be “n is 
the sum of the squares of four nonnegative integers,” or “the sum of the 
integers from 1 to n inclusive is n(n + 1)/2,” or “(1 + x)^n ≥ 1 + nx
for x > -1.” The induction axiom can frequently be used to prove such
theorems in the following way. In the fifth Peano postulate we take for M
the set of positive integers n for which P(n) is true; then showing that M
contains all positive integers amounts to showing that P(n) is true for all
n, as asserted. What must be done, then, is to show that M contains
1 [i.e., that P(1) is a true sentence] and that M contains the successor of
each of its elements [i.e., that if P(m) is true, so also is P(m + 1), for every
positive integer m].

Let us apply this to the proof of the second example above: 

For every positive integer n, we have 


        1 + 2 + ... + n = n(n + 1)/2.


We must first show that P(1) is true, which is obvious: 


        1 = 1(1 + 1)/2.


Next, suppose that m is an integer such that 


        1 + 2 + ... + m = m(m + 1)/2.
If we add m + 1 to both sides of this true equation, we obtain

        1 + 2 + ... + m + (m + 1) = m(m + 1)/2 + (m + 1)
                                  = (m + 1)(m/2 + 1),

so that

        1 + 2 + ... + m + (m + 1) = (m + 1)(m + 2)/2,


which is exactly the sentence P(m + 1). Thus P(m + 1) is true when- 
ever P(m) is, and hence by the induction axiom, P(n) is true for all positive 
integers n. ▲
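
The two parts of this argument can be checked numerically, which is a useful habit before attempting an induction. The Python sketch below is our own illustration; the function names P and step_identity are assumptions of the example. It tests the sentence P(n) for many values of n, and separately tests the algebraic identity used in passing from P(m) to P(m + 1). Such checks catch slips in the algebra, but only the induction itself covers all n.

```python
def P(n):
    """The sentence P(n): 1 + 2 + ... + n equals n(n + 1)/2."""
    return sum(range(1, n + 1)) == n * (n + 1) // 2


def step_identity(m):
    """Algebra of the induction step: m(m+1)/2 + (m+1) == (m+1)(m+2)/2."""
    return m * (m + 1) // 2 + (m + 1) == (m + 1) * (m + 2) // 2


print(P(1))                                          # the base case
print(all(P(n) for n in range(1, 301)))
print(all(step_identity(m) for m in range(1, 301)))
```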

As a second example, consider the Fibonacci sequence 


        1, 1, 2, 3, 5, 8, 13, 21, ...,


in which every element after the second is the sum of the two numbers 
immediately preceding it. (In general, a sequence is an ordered array of 
elements, having a first element, a second element, etc.) If we denote by 
u_n the nth element of this sequence, then the sequence is defined by the
conditions

        u_1 = 1,
        u_2 = 1,
        u_n = u_{n-1} + u_{n-2},    n ≥ 3.                (5)


(This is an example of a recursive definition, in which infinitely many 
elements are defined, each element, after a certain point, being defined in 
terms of preceding elements.) Consider the following theorem: no two 
successive numbers u_n and u_{n+1} have a common factor greater than 1. This
can be rephrased in the form considered above, as follows:


For every positive integer n, u_n and u_{n+1} have no common factor greater
than 1.


Clearly P(1) is true: 1 and 1 have no common factor greater than 1.
Suppose now that m is any integer for which P(m) is true: u_m and u_{m+1}
have no common factor. (Henceforth the restriction “greater than 1” will
be understood.) Then P(m + 1) cannot be false, for if it were, u_{m+1}
and u_{m+2} would have a common factor, say d, and this would also be a
factor of u_m, since

        u_m = u_{m+2} - u_{m+1}.

But then d would be a common factor of u_m and u_{m+1}, which is not the
case. Hence the truth of P(m) implies that of P(m + 1) for every positive 
integer m, and the induction axiom shows that the theorem displayed
above is true. ▲

This proof can easily be recast in terms of the well-ordering axiom. 
If the theorem is false, there is an n for which u_n and u_{n+1} have a common
factor, and hence a smallest such n, say n = m. Now m is not 1, since
u_1 and u_2 have no common factor, and therefore m > 1. But if u_m and
u_{m+1} have the common factor d, then, by our previous reasoning, d is
also a factor of u_{m-1}, and so u_m and u_{m-1} have a common factor. But this
contradicts the definition of m. Hence the theorem is not false. ▲
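
The recursive definition (5) translates directly into a short program, and the theorem just proved can be observed on the first few terms. The Python sketch below is our own illustration; the function name and the number of terms examined are assumptions of the example. It uses the greatest common divisor from the standard library to test whether two successive terms have a common factor greater than 1.

```python
from math import gcd


def fibonacci(count):
    """The first `count` terms u_1, u_2, ... of the sequence defined by (5)."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])   # u_n = u_{n-1} + u_{n-2}
    return terms[:count]


u = fibonacci(30)
print(u[:8])                                  # prints [1, 1, 2, 3, 5, 8, 13, 21]
print(all(gcd(u[i], u[i + 1]) == 1 for i in range(len(u) - 1)))   # prints True
```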

There are two useful variations on the induction principle which should 
be noted. In one, the theorem to be proved concerns the integers not 
less than some fixed integer n_0, rather than those not less than 1, and it
is easily seen that the induction axiom implies the following principle:


If P(n) is true for n = n_0, and if P(m + 1) is true for every integer
m ≥ n_0 for which P(m) is true, then P(n) is true for every integer
n ≥ n_0.


The second variation allows for the case in which P(m + 1) cannot easily 
be deduced from P(m), but depends instead on P(k) for some k < m: 


If P(1) is true, and if P(m + 1) is true for every m ≥ 1 for which
all of P(1), P(2), ..., P(m) are true, then P(n) is true for every n ≥ 1.


This version is obtained from Postulate 5 by taking for M the set of
positive integers m for which all of P(1), P(2), ..., P(m) are true.
To illustrate the use of the last version, consider the following theorem: 


For every positive integer n, u_n < (7/4)^n.


Let P(n) be the inequality (it is a sentence) “u_n < (7/4)^n”, and for brevity
set a = 7/4. Then P(1) and P(2) are certainly true: u_1 < a and u_2 < a^2.
But from the truth of P(m) we cannot deduce that of P(m + 1) directly,
since from u_m < a^m it follows only that u_{m-1} < a^m also, and hence that

        u_{m+1} = u_m + u_{m-1} < a^m + a^m = 2a^m = (8/7) a^{m+1},

which is not the inequality we need. On the other hand, if we suppose that
P(1), ..., P(m) are all true, then we have for m ≥ 2,

        u_{m+1} = u_m + u_{m-1} < a^m + a^{m-1} = a^{m-1}(1 + a) < a^{m-1} · a^2 = a^{m+1},

since 1 + a = 11/4 < 3 < 49/16 = a^2. ▲
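
The argument above turns on the single inequality 1 + a < a^2. The Python sketch below is our own illustration and assumes that the constant in the theorem is a = 7/4, for which 1 + a = 11/4 and a^2 = 49/16; the function name and the number of terms checked are likewise assumptions of the example. It uses exact rational arithmetic to compare each Fibonacci number with the corresponding power of the base.

```python
from fractions import Fraction


def fib_below_power(count, base):
    """Check u_n < base**n for n = 1, ..., count, using exact arithmetic."""
    prev, cur = 1, 1                  # u_1 and u_2
    power = base                      # base**1
    for _ in range(count):
        if not prev < power:          # compare u_n with base**n
            return False
        prev, cur = cur, prev + cur   # advance to u_{n+1}, u_{n+2}
        power *= base
    return True


a = Fraction(7, 4)
print(1 + a < a * a)                           # the key inequality
print(fib_below_power(200, a))
print(fib_below_power(200, Fraction(3, 2)))    # a base with 1 + a > a^2 fails
```

The last line illustrates why the inequality matters: with base 3/2 the bound already fails at the eleventh term, where u_11 = 89.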

There are then two parts to a proof by induction: verification of the 
sentence P(n) for the smallest relevant value n_0 of the so-called “induction
variable” n, and proof of a certain implication, either “P(m) implies
P(m + 1) for all integers m ≥ n_0” or “[P(1) and P(2) and ... and P(m)]
implies P(m + 1) for all integers m ≥ n_0.” Each of these implications has,
in common with all implications, an hypothesis and a conclusion, and in 
the present context the former is called the “induction hypothesis.” The 
student should note carefully the difference between the theorem to be 
proved, “P(n) is true for all positive integers n,” and the induction step, 
“P(m) implies P(m + 1) for all positive integers m.” They are both 
assertions about a sentence P(n), but they are not the same. Moreover, 
although we frequently assume that P(n) is true when we prove the in- 
duction step, this is not the same thing as assuming the truth of the 
theorem (which would of course make the whole thing nonsense), since 
the theorem is not the sentence P(n) but the next to last sentence in quota- 
tion marks above about P(n). Strictly speaking, P(n) by itself is neither 
true nor false (because it contains the variable n), and does not become so 
until n is assigned a value. (The inequality “2 > 3” is false, the inequality 
“5 > 3” is true; the inequality “n > 3” is neither.) When we say, 
“Suppose that P(m) is true,” we really mean, “Suppose that m is an 
integer such that P(m) is true.” 


PROBLEMS 


When a number of quantities of the same general form are to be added or 
multiplied together, it is customary to abbreviate the sum and the product in 
the following way: 


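$$a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k,$$

$$a_1 a_2 \cdots a_n = \prod_{k=1}^{n} a_k.$$

Thus we would write $\tau(1) + \tau(2) + \cdots + \tau(n)$ as 

$$\sum_{k=1}^{n} \tau(k),$$

and could define the factorial n! by the equations 

$$n! = \begin{cases} 1 & \text{for } n = 0, \\ \prod_{k=1}^{n} k & \text{for } n \text{ a positive integer.} \end{cases}$$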


1. Prove the following identities by induction: 


(a) $\sum_{k=1}^{n} k^2 = \dfrac{n(n + 1)(2n + 1)}{6}$. 

(b), (c) [identities illegible in the source] 


2. Show that if n is a positive integer and x is a real number larger than −1, 
then $(1 + x)^n \ge 1 + nx$. 


3. Show that every integer greater than 1 can be represented as a product 
of one or more primes. 


4. Show that if a and b are positive integers, there is a positive integer n such 
that na > b. [Hint: consider the differences b − na, and apply the well-ordering 
axiom.] 


5. Define the binomial coefficients $\binom{n}{k}$ for integers n and k with $0 \le k \le n$ 
by the equation 

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}.$$

Show by direct computation that 

$$\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k} \qquad (1 \le k \le n).$$

Use this identity to prove the binomial theorem by induction: 

$$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + b^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$$

for positive integer n and arbitrary a and b. 


6. Show that a set S of n distinguishable elements has exactly $2^n$ subsets, 
including the empty set and S itself. 


The assertion is sometimes made that mathematical induction is not useful 
for discovering new information, but only for verifying what has already been 
guessed. The following problems bear on this point. 


7. Re-examine the proof that $u_n < a^n$ for all n, to find the smallest β such 
that you could prove that $u_n < \beta^n$ for all n, and carry out this proof. Can you 
prove an inequality in the opposite direction, of the form $u_n > c\beta^n$, for some 
positive constant c? 


8. From the binomial theorem we have 

$$2^{2n} = \binom{2n}{0} + \binom{2n}{1} + \cdots + \binom{2n}{n} + \cdots + \binom{2n}{2n},$$

and hence $\binom{2n}{n} < 2^{2n} = 4^n$. Using the definition of the binomial coefficients, 
show that 

$$\binom{2n+2}{n+1} = 4\left(1 - \frac{1}{2n + 2}\right)\binom{2n}{n}.$$

Deduce an inequality of the form 

$$\binom{2n}{n} > c\beta^n \qquad (*)$$

which is valid for $n \ge 1$, where c and β are specific numbers. Show also that 
for every β < 4 there is a positive constant c such that (*) holds for $n \ge 1$. 


1-4 Indirect proofs. There is a second kind of proof with which the 
reader may not have had much experience, the so-called indirect proof, 
or proof by contradiction. An assertion P is said to have been proved 
by contradiction if it has been shown that, by assuming P to be false, we 
can deduce an assertion Q which is known to be false or which contradicts 
the assumption that P is false. Several proofs by contradiction have 
already occurred above; for example, the proof that there is no integer 
between 0 and 1 and the deduction of the induction axiom from the well- 
ordering axiom were indirect. As another example, consider the theorem 
(known as early as the time of Euclid) that there are infinitely many 
prime numbers. To prove this by contradiction, we assume the opposite, 
namely that there are only finitely many prime numbers. Let these be 
$p_1, \ldots, p_n$; let N be the integer $p_1 p_2 \cdots p_n + 1$; and let Q be the assertion 
that N is divisible by some prime different from any of the primes 
$p_1, \ldots, p_n$. Now N is divisible by some prime p (if N is itself prime, then 
p = N), and N is not divisible by any of the primes $p_1, \ldots, p_n$, since 
each of these leaves a remainder of 1 when divided into N. Hence Q is 
true. Since Q is not compatible with the falsity of the theorem, the theorem 
is true. (Note that the assertion that N is divisible by some prime requires 
proof. This is easily given by induction; see Problem 3 of Section 1-3.) 
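The construction in this proof can be tried on a concrete finite list of primes. The sketch below is illustrative only: the helper names `smallest_prime_factor` and `new_prime_from` are hypothetical, and trial division is used merely because it is the simplest way to exhibit a prime divisor of N.

```python
from math import prod

def smallest_prime_factor(n):
    """Smallest prime dividing n, for n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def new_prime_from(primes):
    """Given a finite list of primes, return a prime that is not in the list."""
    N = prod(primes) + 1
    # Each listed prime leaves remainder 1 when divided into N.
    assert all(N % q == 1 for q in primes)
    return smallest_prime_factor(N)

print(new_prime_from([2, 3, 5]))             # N = 31, itself prime
print(new_prime_from([2, 3, 5, 7, 11, 13]))  # N = 30031 = 59 * 509, prints 59
```

The second call shows that N need not itself be prime; the argument only requires that N have some prime divisor, and that this divisor lie outside the given list.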

Since we are momentarily concerned with logic, it might be helpful to 
say a word about implications. The assertion “P implies Q,” where P 
and Q are sentences, means that Q can be derived from P by logically correct 
steps (more precisely: from P and the axioms of the system with which 
one is concerned). It says nothing about P and Q individually, but it 
makes a statement about a relationship between them. It can also be 
interpreted as meaning “whenever (or if) P is true, so is Q.” It can be 
proved either by starting from P and deducing Q, or by starting from “Q 
is false” and deducing “P is false,” the latter being an indirect proof. 
(See, for example, the inductive step in the proof that $u_n$ and $u_{n+1}$ have 
no common factor.) If P implies Q, then Q is said to be a necessary condi- 
tion for P, since Q necessarily happens whenever P does, and P is said to 
be a sufficient condition for Q, since the truth of P guarantees (is sufficient 
for) that of Q. If P implies Q and Q implies P, then P and Q are said to 
be equivalent statements, each is said to be a necessary and sufficient con- 
dition for the other, and we say that one is valid if and only if the other is. 
For example, for a number larger than 2 to be prime, it is necessary, but 
not sufficient, that it be odd. In order that a polynomial assume both 
positive and negative values for appropriate values of x, it is sufficient, 
but not necessary, that it be of odd degree. It is a famous theorem of 
P. Dirichlet that there are infinitely many primes among the numbers m, 
m + k, m + 2k, m + 3k, ..., if and only if m and k have no common 
factor larger than 1. A necessary and sufficient condition for an integer 
to be divisible by 9 is that the sum of its digits be divisible by 9. A number 
is a square only if its final digit is one of 0, 1, 4, 5, 6, or 9. 
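Two of the examples above lend themselves to a quick mechanical check. The following is a small sketch (the helper name `digit_sum` is hypothetical) that tests the digit-sum criterion for divisibility by 9 over a finite range and lists the final digits that squares can have.

```python
def digit_sum(n):
    """Sum of the decimal digits of a nonnegative integer."""
    return sum(int(ch) for ch in str(n))

# Necessary and sufficient: 9 divides n exactly when 9 divides its digit sum.
assert all((n % 9 == 0) == (digit_sum(n) % 9 == 0) for n in range(1, 10000))

# Necessary only: the final digit of k*k depends only on the final digit of k.
square_endings = sorted({(k * k) % 10 for k in range(10)})
print(square_endings)  # [0, 1, 4, 5, 6, 9]

# Not sufficient: 14 ends in an allowed digit but is not a square.
print(14 % 10 in square_endings, any(k * k == 14 for k in range(15)))  # True False
```

The last line is the point of the distinction: the final-digit condition can rule a number out as a square, but it can never confirm one.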

The above is a very brief introduction to the logic of mathematics, but 
it will suffice for this book. However, one more point should be made about 
the proofs encountered in elementary number theory, which verges more 
toward the psychological. It is a well-known phenomenon in mathe- 
matics that an excessively simple theorem frequently is difficult to prove 
(although the proof, in retrospect, may be short and elegant) just because 
of its simplicity. This is probably due in part to the lack of any hint in 
the statement of the theorem concerning the machinery to be used in 
proving it, and in part to the lack of available machinery. Many theorems 
of elementary number theory are of this kind, and there is considerable 
diversity in the types of arguments used in their proofs. When we are 
presented with a large number of theorems bearing on the same subject 
but proved by quite diverse means, the natural tendency is to regard the 
techniques used in the various proofs as special tricks, each applicable 
only to the theorem with which it is associated. A technique ceases to be a 
trick and becomes a method only when it has been encountered enough times 
to seem natural; correspondingly, a subject may be regarded as a “bag of 
tricks” if the ratio of techniques to results is too high. Unfortunately, 
elementary number theory has sometimes been regarded as such a subject. 
On working longer in the field, however, we find that many of the tricks 
become methods, and that there is more uniformity than is at first appar- 
ent. By making a conscious effort to abstract and retain the core of the 
proofs that follow, the reader will begin to see patterns emerging sooner 
than he otherwise might. 

Consider, for example, the assertion that τ(n) is even unless n is a square, 
i.e., the square of another integer. This can be proved as follows: If d 
is a divisor of n, then so is the integer n/d. If n is not a square, then 
$d \ne n/d$, since otherwise $n = d^2$. Hence, if n is not a square, its divisors 
can be paired off into couples d, n/d, so that each divisor of n occurs just 
once as an element of some one of these couples. The number of divisors is 
therefore twice the number of couples and, being twice an integer, is even. 
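The pairing argument translates directly into a way of listing divisors. In the sketch below (function names are hypothetical) each divisor d with d·d ≤ n is matched with its partner n/d, and τ(n) is obtained by counting two for every couple with distinct members and one for a couple of the form d, d.

```python
def divisor_pairs(n):
    """Return the couples (d, n // d) with d <= n // d, for d dividing n."""
    pairs = []
    d = 1
    while d * d <= n:
        if n % d == 0:
            pairs.append((d, n // d))
        d += 1
    return pairs

def tau(n):
    """Number of positive divisors of n, counted through the pairing."""
    count = 0
    for d, e in divisor_pairs(n):
        count += 1 if d == e else 2
    return count

print(divisor_pairs(12), tau(12))  # [(1, 12), (2, 6), (3, 4)] 6
print(divisor_pairs(36), tau(36))  # the couple (6, 6) makes the count odd: 9
```

A couple with d = n/d can occur only when n = d·d, which is why the count is even for every non-square.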

We have here applied the principle that in counting integers having a 
certain property (here “counting” may be replaced by “adding”), we may 
find it helpful first to group them in judicious fashion. There are several 
problems in the present book whose solutions depend on this idea. 
PROBLEMS 


1. Show that τ(n) is odd if n is a square. 

2. Anticipating Theorem 1-1, suppose that every integer can be written in 
the form 6k + r, where k is an integer and r is one of the numbers 0, 1, 2, 3, 4, 5. 
(a) Show that if p = 6k + r is a prime different from 2 and 3, then r = 1 or 5. 
(b) Show that the product of numbers of the form 6k + 1 is of the same form. 
(c) Show that there exists a prime of the form 6k — 1 = 6(k — 1) + 5. 
(d) Show that there are infinitely many primes of the form 6k — 1. 


1-5 Radix representation. Although we have assumed a knowledge of 
the structure of the system of integers, we have said nothing about the 
method which will be used to assign names to the integers. There are, of 
course, various ways of doing this, of which the Roman and decimal sys- 
tems are probably the best known. While the decimal system has obvious 
advantages over Roman numerals, and the advantage of familiarity over 
any other method, it is not always the best system for theoretical pur- 
poses. A rather more general scheme is sometimes convenient, and it is 
the object of the following two theorems to show that a representation 
of this kind is possible, i.e., that each integer is given a unique name. 
Here, and throughout the remainder of the book, lower-case Latin letters will 
denote integers, except where otherwise specified. 


THEOREM 1-1. If a is positive and b is arbitrary, there is exactly one 
pair of integers q, r such that the conditions 

$$b = qa + r, \qquad 0 \le r < a, \tag{6}$$

hold. 


Proof: First, we show that (6) has at least one solution. 
Consider the set D of integers of the form b − ua, where u runs over 
all integers, positive and nonpositive. For the particular choice 

$$u = \begin{cases} 0 & \text{if } b \ge 0, \\ b & \text{if } b < 0, \end{cases}$$

the number b − ua is nonnegative, so that D contains nonnegative elements. 
The subset consisting of the nonnegative elements of D therefore 
has a smallest element. Take r to be this number, and q the value of u 
which corresponds to it; i.e., let q be the largest integer such that 
$b - qa \ge 0$. Then $r = b - qa \ge 0$, whereas 

$$r - a = b - (q + 1)a < 0;$$


hence (6) is satisfied. 
To show the uniqueness of q and r, assume that q′ and r′ also are integers 
such that 

$$b = q'a + r', \qquad 0 \le r' < a.$$

Then if $q' < q$, we have 

$$b - q'a = r' \ge b - (q - 1)a = r + a \ge a,$$

and this contradicts the inequality $r' < a$. Hence $q' \ge q$. Similarly, 
we show that $q \ge q'$. Therefore $q = q'$, and consequently $r = r'$. ▲ 
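The existence half of the proof is constructive enough to be run. The sketch below (the function name `divide` is hypothetical) starts from the particular u chosen in the proof, which yields a nonnegative element of D, and then steps through D until the least nonnegative element is reached. It is deliberately naive, to mirror the argument rather than to be efficient; for positive a, Python's built-in `divmod(b, a)` returns a pair with the same properties, and the final line compares the two.

```python
def divide(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < a, for a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    u = 0 if b >= 0 else b   # the particular choice of u from the proof
    r = b - u * a            # a nonnegative element of D
    while r >= a:            # move to the least nonnegative element of D
        u += 1
        r -= a
    return u, r

print(divide(23, 7))    # (3, 2)
print(divide(-7, 3))    # (-3, 2)
print(all(divide(b, a) == divmod(b, a)
          for b in range(-60, 60) for a in range(1, 13)))
```

Note that for negative b the quotient is rounded toward minus infinity, which is what the condition 0 ≤ r < a forces.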


THEOREM 1-2. Let g be greater than 1. Then each integer a greater 
than 0 can be represented uniquely in the form 

$$a = c_0 + c_1 g + \cdots + c_n g^n,$$

where $c_n$ is positive and $0 \le c_m < g$ for $0 \le m \le n$. 


Proof: We prove the representability by induction on a. For a = 1, 
we have n = 0 and $c_0 = 1$. 

Take a greater than 1 and assume that the theorem is true for 1, 2, ..., 
a − 1. Since g is larger than 1, the numbers $g^0, g^1, g^2, \ldots$ form an 
increasing sequence, and any positive integer lies between some pair of 
successive powers of g. More explicitly, there is a unique $n \ge 0$ such that 
$g^n \le a < g^{n+1}$. By Theorem 1-1, there are unique integers $c_n$ and r 
such that 

$$a = c_n g^n + r, \qquad 0 \le r < g^n.$$

Here $c_n > 0$, since $c_n g^n = a - r > g^n - g^n = 0$; moreover, $c_n < g$ 
because $c_n g^n \le a < g^{n+1}$. If r = 0, then 

$$a = 0 + 0 \cdot g + \cdots + 0 \cdot g^{n-1} + c_n g^n,$$

whereas if r is positive, the induction hypothesis shows that r has a 
representation of the form 

$$r = b_0 + b_1 g + \cdots + b_t g^t,$$

where $b_t$ is positive and $0 \le b_m < g$ for $0 \le m \le t$. Moreover, t < n. 
Thus 

$$a = b_0 + b_1 g + \cdots + b_t g^t + 0 \cdot g^{t+1} + \cdots + 0 \cdot g^{n-1} + c_n g^n,$$

where the terms with coefficient zero occur only if t + 1 < n. Now use 
the induction principle. 


To prove uniqueness, assume that there are two distinct representations 
for a: 


$$a = c_0 + c_1 g + \cdots + c_n g^n = d_0 + d_1 g + \cdots + d_r g^r,$$

with $n \ge 0$, $c_n > 0$, and $0 \le c_m < g$ for $0 \le m \le n$, and also $r \ge 0$, 
$d_r > 0$, and $0 \le d_m < g$ for $0 \le m \le r$. Then by subtracting one of 
these representations of a from the other, we obtain an equation of the 
form 

$$0 = e_0 + e_1 g + \cdots + e_s g^s,$$

where s is the largest value of m for which $c_m \ne d_m$, so that $e_s \ne 0$. 
If s = 0, we have the contradiction $e_0 = e_s = 0$. If s > 0 we have 

$$|e_m| = |c_m - d_m| \le g - 1, \qquad 0 \le m \le s - 1,$$

and 

$$e_s g^s = -(e_0 + \cdots + e_{s-1} g^{s-1}),$$

so that 

$$g^s \le |e_s g^s| = |e_0 + \cdots + e_{s-1} g^{s-1}| \le |e_0| + \cdots + |e_{s-1}| g^{s-1} \le (g - 1)(1 + g + \cdots + g^{s-1}) = g^s - 1,$$

which is also a contradiction. We conclude that n = r and $c_m = d_m$ 
for $0 \le m \le n$, and the representation is unique. ▲ 
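The existence argument suggests a procedure that finds the leading coefficient first: locate n with g^n ≤ a < g^(n+1), divide by g^n, and continue with the remainder. The following is a minimal sketch of that idea, written as a loop rather than as a literal induction; the function name `digits_top_down` is hypothetical.

```python
def digits_top_down(a, g):
    """Coefficients [c_0, c_1, ..., c_n] of a in base g, leading one found first."""
    if a <= 0 or g <= 1:
        raise ValueError("need a > 0 and g > 1")
    n = 0
    while g ** (n + 1) <= a:      # find n with g^n <= a < g^(n+1)
        n += 1
    coeffs = [0] * (n + 1)
    r = a
    for m in range(n, -1, -1):    # Theorem 1-1 applied with divisor g^m
        coeffs[m], r = divmod(r, g ** m)
    return coeffs

c = digits_top_down(2743, 7)
print(c)                                         # [6, 6, 6, 0, 1]
print(sum(cm * 7 ** m for m, cm in enumerate(c)))  # 2743
```

Powers of g that are skipped in the proof (the terms with coefficient zero) appear here as divisions whose quotient is 0.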

By means of Theorem 1-2 we can construct a system of names or sym- 
bols for the positive integers in the following way. We choose arbitrary 
symbols to stand for the digits (i.e., the nonnegative integers less than g) 
and replace the number 


$$c_0 + c_1 g + \cdots + c_n g^n$$

by the simpler symbol $c_n c_{n-1} \ldots c_1 c_0$. For example, choosing g to be ten, 
and giving the smaller integers their customary symbols, we have the 
ordinary decimal system, in which, for example, 2743 is an abbreviation 
for the value of the polynomial $2x^3 + 7x^2 + 4x + 3$ when x is ten. But 
there is no reason why we must use ten as the base, or radix; if we used 
7 instead, we would write the integer whose decimal representation is 
2743 as 10666, since 


$$2743 = 6 + 6 \cdot 7 + 6 \cdot 7^2 + 0 \cdot 7^3 + 1 \cdot 7^4.$$

To indicate the base being used, when it is different from ten, we might 
use a subscript (in the decimal system), so that for example, 

$$2743 = (10666)_7.$$


Of course, if the radix is larger than 10, it will be necessary to invent 
symbols to replace 10, 11,..., g — 1. For example, taking g = 12 and 
setting 10 = α, 11 = β, we have 

$$(14)_{12} + (7)_{12} = (1\beta)_{12}$$

and 

$$(31)_{12} \cdot (\alpha)_{12} = 37 \cdot 10 = 370 = (26\alpha)_{12}.$$
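These conversions are easy to reproduce by repeated division by the base, which produces the digits from the right. The sketch below is an illustration under stated assumptions: the names `DIGITS`, `to_base`, and `from_base` are hypothetical, and the symbols α and β are used for the digits ten and eleven in base 12.

```python
DIGITS = "0123456789αβ"   # α stands for ten, β for eleven

def to_base(a, g):
    """Write the positive integer a in base g (2 <= g <= 12) as a digit string."""
    if a <= 0 or not 2 <= g <= len(DIGITS):
        raise ValueError("need a > 0 and 2 <= g <= 12")
    out = ""
    while a > 0:
        a, c = divmod(a, g)   # c is the next digit, from the right
        out = DIGITS[c] + out
    return out

def from_base(s, g):
    """Value of the digit string s read in base g."""
    value = 0
    for ch in s:
        value = value * g + DIGITS.index(ch)
    return value

print(to_base(2743, 7))                                        # 10666
print(to_base(from_base("14", 12) + from_base("7", 12), 12))   # 1β
print(to_base(from_base("31", 12) * from_base("α", 12), 12))   # 26α
```

The arithmetic itself is carried out on ordinary integers; only the names of the results change with the base.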


In addition to the usual base, 10, the numbers 2 and 12 have received 
serious attention as useful bases. The proponents of the base 12 (the 
duodecimal system, as it is called) argue that 12 is a better base than 10 
because in the duodecimal system many more fractions have terminating 
decimal (or rather, duodecimal) expansions [e.g., $1/2 = (0.6)_{12}$, $1/3 = (0.4)_{12}$, 
$1/4 = (0.3)_{12}$, $1/6 = (0.2)_{12}$, $1/12 = (0.1)_{12}$], large numbers 
could be written in shorter form, and some systems of measurement (e.g., 
feet and inches) are already duodecimal. Be that as it may, and counter- 
arguments certainly exist, there does not seem to be the slightest chance 
of such a “reform” occurring, so the subject must remain in the realm of 
idle speculation. 

The base 2 is another matter completely. The binary system, consisting 
of only two digits, 0 and 1, is in constant use today in the scientific world, 
specifically in modern high-speed computers; in these machines the two 
binary digits correspond to the physical alternative that something is or 
is not the case: current is or is not flowing, a spot on a magnetic tape is 
or is not magnetized, etc. If we liken digits to colors, we might say that 
in the binary system we can see only black and white, whereas in the 
decimal system the digits distinguish ten shades, from white through 
gray to black. In this sense a binary digit carries less information than a 
decimal digit, a fact reflected in the far greater number of binary digits 
required to express any number which is at all large; for example, 


$$1024 = (10{,}000{,}000{,}000)_2.$$


The machine experts have neatly summarized this situation by abbreviat- 
ing “binary digit” to “bit,” indicating that one binary digit is one bit of 
information. 

What may seem at first sight to be a disadvantage, namely that a large 
number of bits is required to represent a significant amount of information, 
is in fact only the other side of the coin of versatility; a bit is like a brick, 
in that it takes a lot of them to make anything interesting, but a very 
wide range of things can be made out of them, exactly because they have 
so little built-in structure. 
PROBLEMS 


1. (a) Show that using only the standard weights $1, 2, 2^2, \ldots, 2^n$, one can 
weigh any integral weight less than $2^{n+1}$ by putting the unknown weight on 
one pan of the balance and a suitable combination of standard weights on the 
other pan. 

(b) Prove that no other set of n + 1 weights will do this. [Hint: Name the 
weights so that $w_0 \le w_1 \le \cdots \le w_n$. Let k be the smallest index such that 
$w_k \ne 2^k$ and obtain a contradiction, using the fact that the number of nonempty 
subsets of a set of n + 1 elements is $2^{n+1} - 1$.] 

2. Construct the addition and multiplication tables for the duodecimal digits, 
i.e., the digits in base 12. Using these tables, evaluate 


$$(2109)_{12} \cdot (8370)_{12}.$$


3. To multiply two numbers, such as 37 and 22, set up a table according to 
the following pattern: 


37 22 
18 44 
9 88 
4 176 
2 352 
1 704 


The first column is formed by successive halvings (fractional remainders are 
discarded whenever they occur) and the second by successive doublings. If the 
elements of the second column standing opposite odd numbers in the first are 
added together, the result is 22 + 88 + 704 = 814 = 22·37. Use the binary 
representation to show that this rule is general. 


4. Let $u_1, u_2, \ldots$ be the Fibonacci sequence defined in Section 1-3. 
(a) Prove by induction (or otherwise) that for n > 0, 

$$u_{n-1} + u_{n-3} + u_{n-5} + \cdots < u_n,$$

the sum on the left continuing so long as the subscript remains larger than 1. 

(b) Show that every positive integer can be represented in a unique way in 
the form $u_{n_1} + u_{n_2} + \cdots + u_{n_k}$, where $k \ge 1$, $n_{j-1} \ge n_j + 2$ for j = 2, 
3, ..., k, and $n_k \ge 1$. 
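The halving-and-doubling table of Problem 3 can be generated mechanically, which makes it easy to experiment with other pairs of numbers before attempting the proof. The sketch below only reproduces the rule as stated; it does not supply the requested argument from the binary representation. The function name `halve_and_double` is hypothetical.

```python
def halve_and_double(x, y):
    """Build the table for x * y and add the right-hand entries opposite odd numbers."""
    rows = []
    while x >= 1:
        rows.append((x, y))
        x, y = x // 2, y * 2      # fractional remainders are discarded
    total = sum(right for left, right in rows if left % 2 == 1)
    return rows, total

rows, total = halve_and_double(37, 22)
for left, right in rows:
    print(left, right)
print(total)  # 814
```

Running it on 37 and 22 reproduces the six rows of the table and the sum 22 + 88 + 704.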


CHAPTER 2 
THE EUCLIDEAN ALGORITHM AND ITS CONSEQUENCES 


2-1 Divisibility. Let a be different from zero, and let b be arbitrary. 
Then, if there is a c such that b = ac, we say that a divides b, or that a 
is a divisor of b, and write a|b (negation: a∤b). As usual, the letters involved 
represent integers. 

The following statements are immediate consequences of this definition: 


(1) For every $a \ne 0$, a|0 and a|a. For every b, ±1|b. 

(2) If a|b and b|c, then a|c. 

(3) If a|b and a|c, then a|(bx + cy) for each x, y. (If a|b and a|c, then a 
is said to be a common divisor of b and c.) 

(4) If a|b and $b \ne 0$, then $|a| \le |b|$. 


2-2 The Euclidean algorithm and greatest common divisor. 


THEOREM 2-1. Given any two integers a and b not both zero, there is 
a unique integer d such that 


(i) d > 0; 
(ii) d|a and d|b; 
(iii) if $d_1$ is any integer such that $d_1 \mid a$ and $d_1 \mid b$, then $d_1 \mid d$. 


Property (iii) says that every common divisor of a and b divides d; 
from assertion (4) above, it follows that d is the numerically largest of 
the various common divisors of a and b. Thus, among the common divisors of a 
and b, d is maximal in two different senses, and hence is called the greatest 
common divisor of a and b. We abbreviate this statement by saying that 
the GCD of a and b is d, and writing simply (a, b) = d. The nomenclature 
is somewhat misleading, because “greatest” seems to refer to size, 
whereas it is actually the maximality of d in the sense of (iii) which is 
important, and not its size. 


Proof: First let a and b be positive, and suppose that a is the larger of 
the two numbers; otherwise we can simply interchange their names. 
By Theorem 1-1, there are unique integers $q_1$ and $r_1$ such that 

$$a = bq_1 + r_1, \qquad 0 \le r_1 < b.$$
If $r_1 = 0$, then b is a divisor of a and we can take d = b, insofar as conditions 
(i) through (iii) above are concerned: b is positive, it is a common 
divisor of a and b, and every common divisor of a and b is a divisor of b. 
We shall return to the question of uniqueness below. 

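If $r_1 \ne 0$, then repeated application of Theorem 1-1 shows the existence 
of unique pairs $q_2, r_2$; $q_3, r_3$; ..., such that 

$$\begin{aligned}
a &= bq_1 + r_1, & 0 &< r_1 < b, \\
b &= r_1 q_2 + r_2, & 0 &< r_2 < r_1, \\
r_1 &= r_2 q_3 + r_3, & 0 &< r_3 < r_2, \\
&\;\;\vdots \\
r_{k-3} &= r_{k-2} q_{k-1} + r_{k-1}, & 0 &< r_{k-1} < r_{k-2}, \\
r_{k-2} &= r_{k-1} q_k + r_k, & 0 &< r_k < r_{k-1}, \\
r_{k-1} &= r_k q_{k+1}.
\end{aligned}$$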


Here we are confronted at each stage with the possibility that the re- 
mainder is zero, but we have assumed that this does not happen until the 
kth stage, when we divide $r_{k-1}$ by $r_k$; or, to put it the other way around, 
we define k as the number of the stage at which a zero remainder appears. 
The process must stop then, of course, since Theorem 1-1 does not provide 
for division by zero. On the other hand, a zero remainder must eventually 
occur, since each remainder is a nonnegative integer strictly smaller than 
the preceding one, and the existence of an infinite sequence of such numbers 
contradicts the well-ordering axiom. Thus if $b \nmid a$, there is always a 
finite system of equations of the kind above, and a last nonzero remainder 
$r_k$. We assert that to satisfy conditions (i) through (iii) we can take $d = r_k$. 
For from the last equation we see that $r_k \mid r_{k-1}$; from the preceding equation, 
using statement (2) of Section 2-1, we see that $r_k \mid r_{k-2}$, etc. Finally, from 
the second and first equations, respectively, it follows that $r_k \mid b$ and $r_k \mid a$. 
Thus $r_k$ is a common divisor of a and b. Now let $d_1$ be any common divisor 
of a and b. From the first equation, $d_1 \mid r_1$; from the second, $d_1 \mid r_2$; etc.; 
from the next-to-last equation $d_1 \mid r_k$. Thus we can take the d of the 
theorem to be $r_k$. 

If a < b, interchange the names of a and b. If either a or b is negative, 
find the d corresponding to |a| and |b|. If a is zero, (a, b) = |b|. 

If both $d_1$ and $d_2$ have the properties of the theorem, then $d_1$, being a 
common divisor of a and b, divides $d_2$. Similarly, $d_2 \mid d_1$. This clearly implies 
that $d_1 = d_2$, and the GCD is unique. ▲ 

The chain of operations indicated by the above equations is known as 
the Euclidean algorithm; as will be seen, it is a cornerstone of multiplicative 
number theory. (In general, an algorithm is a systematic procedure 
which is applied repeatedly, each step depending on the results of the 
earlier steps. Other examples are the long-division algorithm and the 
square-root algorithm.) The Euclidean algorithm is actually quite prac- 
ticable in numerical cases; for example, if we wish to find the GCD of 
4147 and 10672, we compute as follows: 


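$$\begin{aligned}
10672 &= 4147 \cdot 2 + 2378, \\
4147 &= 2378 \cdot 1 + 1769, \\
2378 &= 1769 \cdot 1 + 609, \\
1769 &= 609 \cdot 2 + 551, \\
609 &= 551 \cdot 1 + 58, \\
551 &= 58 \cdot 9 + 29, \\
58 &= 29 \cdot 2.
\end{aligned}$$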


Hence (4147, 10672) = 29. 
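The chain of divisions is straightforward to mechanize. The sketch below (the function name `euclid_steps` is hypothetical) records each division as a tuple so that the computation above can be reproduced line by line; the last nonzero remainder is returned as the GCD.

```python
def euclid_steps(a, b):
    """Return (steps, d): the divisions performed and the GCD of a and b."""
    a, b = abs(a), abs(b)
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    if a < b:
        a, b = b, a               # make a the larger of the two
    steps = []
    while b != 0:
        q, r = divmod(a, b)       # a = b*q + r with 0 <= r < b
        steps.append((a, b, q, r))
        a, b = b, r
    return steps, a               # a now holds the last nonzero remainder

steps, d = euclid_steps(4147, 10672)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {divisor}*{q} + {r}")
print(d)  # 29
```

The loop terminates for the reason given in the proof: the remainders form a strictly decreasing sequence of nonnegative integers.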

It is frequently important to know whether two integers a and b have 
a common factor larger than 1. If they have not, so that (a,b) = 1, we 
say that they are relatively prime, or prime to each other. 

The following properties of the GCD are easily derived either from the 
definition or from the Euclidean algorithm. 


(a) The GCD of more than two numbers, defined as that positive 
common divisor which is divisible by every common divisor, exists 
and can be found in the following way. Let there be n numbers 
$a_1, a_2, \ldots, a_n$, and define 

$$D_1 = (a_1, a_2), \quad D_2 = (D_1, a_3), \quad \ldots, \quad D_{n-1} = (D_{n-2}, a_n).$$

Then $(a_1, a_2, \ldots, a_n) = D_{n-1}$. 

(b) (ma, mb) = m(a, b) if m > 0. 

(c) If m|a and m|b, then (a/m, b/m) = (a, b)/m, provided m > 0. 

(d) If (a, b) = d, there exist integers x, y such that ax + by = d. 
[This last statement has an important consequence, namely, if a 
and b are relatively prime, there exist x, y such that ax + by = 1. 
Conversely, if there is such a representation of 1, then clearly 
(a, b) = 1.] 

(e) If a given integer is relatively prime to each of several others, it 
is relatively prime to their product. For example, if (a,b) = 1 
and (a, c) = 1, there are x, y, t, and u such that ax + by = 1 and 
at + cu = 1, whence 

$$ax + by(at + cu) = a(x + byt) + bc(yu) = 1,$$
and therefore (a, bc) = 1. 
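Some of these properties can be spot-checked numerically. The following sketch uses the standard-library `math.gcd` for the two-argument GCD; the helper name `gcd_many` and the sample numbers are illustrative choices, and a few numerical instances are of course not a proof.

```python
from functools import reduce
from math import gcd

def gcd_many(numbers):
    """Property (a): D_1 = (a_1, a_2), D_2 = (D_1, a_3), ..., folded left to right."""
    return reduce(gcd, numbers)

print(gcd_many([84, 126, 210]))                  # 42

# Property (b): (ma, mb) = m(a, b) for m > 0.
print(gcd(5 * 12, 5 * 18) == 5 * gcd(12, 18))    # True

# Property (e): (a, b) = 1 and (a, c) = 1 imply (a, bc) = 1.
a, b, c = 35, 12, 11
print(gcd(a, b), gcd(a, c), gcd(a, b * c))       # 1 1 1
```

Folding with `reduce` is exactly the recursion D_1, D_2, ..., D_{n−1} of property (a).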
The Euclidean algorithm can be used to find the x and y of property 
(d). Thus, using the numerical example above, we have 


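$$\begin{aligned}
29 &= 551 - 58 \cdot 9 & (58 &= 609 - 551 \cdot 1) \\
   &= 551 - 9(609 - 551 \cdot 1) \\
   &= 10 \cdot 551 - 9 \cdot 609 & (551 &= 1769 - 2 \cdot 609) \\
   &= 10(1769 - 2 \cdot 609) - 9 \cdot 609 \\
   &= 10 \cdot 1769 - 29 \cdot 609 & (609 &= 2378 - 1 \cdot 1769) \\
   &= 10 \cdot 1769 - 29(2378 - 1 \cdot 1769) \\
   &= 39 \cdot 1769 - 29 \cdot 2378 & (1769 &= 4147 - 2378) \\
   &= 39(4147 - 2378) - 29 \cdot 2378 \\
   &= 39 \cdot 4147 - 68 \cdot 2378 & (2378 &= 10672 - 2 \cdot 4147) \\
   &= 39 \cdot 4147 - 68(10672 - 2 \cdot 4147) \\
   &= 175 \cdot 4147 - 68 \cdot 10672,
\end{aligned}$$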


so that x = 175, y = −68 is one pair of integers such that 4147x + 
10672y = 29. It is not the only such pair, as we shall see in Section 2-4. 
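The coefficients can also be computed alongside the divisions rather than by substituting backward afterward. The sketch below is a forward variant, not the back-substitution carried out above: it keeps, for each remainder, a pair of coefficients expressing it in terms of the original a and b. The function name `extended_gcd` is hypothetical.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with a*x + b*y == d, for nonnegative a, b not both zero."""
    # Invariant: current a == x0*A + y0*B and current b == x1*A + y1*B,
    # where A and B are the original arguments.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

d, x, y = extended_gcd(4147, 10672)
print(d, x, y)                   # 29 175 -68
print(4147 * x + 10672 * y)      # 29
```

For this pair of numbers the forward computation happens to produce the same x and y as the back-substitution; in general any valid pair serves for property (d).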


PROBLEMS 


1. Show that if a|b and $b \ne 0$, then $|a| \le |b|$. 

2. Show that (a, b) = (a, b + ka) for every k. 

3. Show that if (a, b) = 1, then (a − b, a + b) = 1 or 2. 

4. Show that if ax + by = m, then (a, b)|m. 

5. Prove assertions (a) through (e) of the text. 

6. (a) Evaluate (4655, 12075), and express the result as a linear combination 
of 4655 and 12075; that is, in the form 4655x + 12075y. (b) Do the same for 
(1869, 2597). (c) Do the same for (2048, 1275). 

7. Show that no cancellation is possible in the fraction 

$$\frac{a_1 + a_2}{b_1 + b_2}$$

if $a_1 b_2 - a_2 b_1 = \pm 1$. 
8. Evaluate the following: 
(a) (493, 731, 1751); (b) (4410, 1404, 8712); (c) (703, 893, 1729, 33041). 
9. Show that if b|a, c|a, and (b, c) = 1, then bc|a. 

10. Show that if (b, c) = 1, then (a, bc) = (a, b)(a, c). [Hint: Prove that 
each member of the alleged equation divides the other. Use property (d) in 
the text, and the preceding problem.] 

11. In the notation introduced in the proof of Theorem 2-1, show that each 
nonzero remainder $r_m$, with m > 2, is less than $r_{m-2}/2$. [Hint: Consider separately 
the cases in which $r_{m-1}$ is less than, equal to, or greater than $r_{m-2}/2$.] Deduce 
that the number of divisions in the Euclidean algorithm is at most 2n + 1, 
where n is that integer such that $2^n \le b < 2^{n+1}$, and where b is the smaller of 
the two numbers whose GCD is being found. 

12. (a) Let D be the smallest positive number which can be represented in the 
form ax + by with integers x and y. Show that if c is any integer representable 
in this form, then D|c. [Hint: Apply Theorem 1-1 and show that the remainder 
upon dividing c by D must be zero, because of the minimality of D.] (b) Show 
that D|a and D|b. (c) Prove Theorem 2-1 without using the Euclidean algorithm. 

13. Use the method of the preceding problem to prove the existence and 
uniqueness of an appropriately defined GCD of several integers $a_1, \ldots, a_n$, 
not all of which are zero. 

14. Extend assertions (b) through (e) of the text to the case of several 
integers. 


2-3 The unique factorization theorem. 


THEOREM 2-2. Every integer a > 1 can be represented as a product 
of one or more primes. (It is customary to allow products to contain 
only one factor, and sums to contain only one term, since this simplifies 
the statements of theorems.) 


Proof: The theorem is true for a = 2. Assume it to be true for 2, 
3, 4, ..., a − 1. If a is prime, we are through. Otherwise a has a divisor 
different from 1 and a, and we have a = bc, with 1 < b < a, 1 < c < a. 
The induction hypothesis then implies that 

$$b = \prod_{i=1}^{s} p_i', \qquad c = \prod_{i=1}^{t} p_i'',$$

with $p_i'$, $p_i''$ primes, and hence $a = p_1' p_2' \cdots p_s' p_1'' \cdots p_t''$. ▲ 
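The induction in this proof has the shape of a recursive procedure: if a has a proper divisor b, factor b and a/b separately and concatenate the results; otherwise a is prime. A minimal sketch follows. The function name `prime_factors` is hypothetical, and trial division is used to find a divisor only because it is the simplest choice.

```python
def prime_factors(a):
    """Return a list of primes whose product is a, for a > 1."""
    if a <= 1:
        raise ValueError("a must exceed 1")
    b = 2
    while b * b <= a:
        if a % b == 0:            # a = b * c with 1 < b < a and 1 < c < a
            return prime_factors(b) + prime_factors(a // b)
        b += 1
    return [a]                    # no proper divisor was found: a is prime

print(prime_factors(4147))    # [11, 13, 29]
print(prime_factors(10672))   # [2, 2, 2, 2, 23, 29]
```

The two factorizations share only the prime 29, in agreement with the value (4147, 10672) = 29 found by the Euclidean algorithm. That the list of primes obtained does not depend on how the divisors are chosen is the content of the unique factorization theorem, not of Theorem 2-2.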

Any positive integer which is not prime and which is different from 
unity is said to be composite. Hereafter p will be used to denote a positive 
prime number, unless otherwise specified. 


THEOREM 2-3. If a|bc and (a, b) = 1, then a|c. 


Proof: If (a, b) = 1, there are integers x and y such that ax + by = 1, 
or acx + bcy = c. But a divides both ac and bc, and hence the left side 
of this equation, and therefore a divides c. ▲ 


THEOREM 2-4. If 

$$p \,\Big|\, \prod_{m=1}^{n} p_m,$$

then for at least one m, we have $p = p_m$. 


Proof: Suppose that $p \mid p_1 p_2 \cdots p_n$ but that p is different from any of 
the primes $p_1, p_2, \ldots, p_{n-1}$. Then p is relatively prime to each of $p_1, \ldots, p_{n-1}$ 
and so, by property (e) of the preceding section, is relatively prime 
to their product. By Theorem 2-3, $p \mid p_n$, whence $p = p_n$. ▲ 
THEOREM 2-5 (Unique Factorization Theorem). The representation of 
a > 1 as a product of primes is unique up to the order of the factors. 


Proof: We must show exactly the following. From 

$$a = \prod_{m=1}^{n_1} p_m = \prod_{m=1}^{n_2} p_m' \qquad (p_1 \le p_2 \le \cdots \le p_{n_1};\; p_1' \le p_2' \le \cdots \le p_{n_2}'),$$

it follows that $n_1 = n_2$ and $p_m = p_m'$ for $1 \le m \le n_1$. 

For a = 2 the assertion is true, since $n_1 = n_2 = 1$ and $p_1 = p_1' = 2$. 
For a > 2, assuming the assertion to be correct for 2, 3, ..., a − 1, 
we find: 

(a) If a is prime, $n_1 = n_2 = 1$, $p_1 = p_1' = a$. 

(b) Otherwise $n_1 > 1$, $n_2 > 1$. From 

$$p_1' \,\Big|\, \prod_{m=1}^{n_1} p_m, \qquad p_1 \,\Big|\, \prod_{m=1}^{n_2} p_m'$$

it follows by Theorem 2-4 that for at least one r and at least one s, 

$$p_1' = p_r, \qquad p_1 = p_s'.$$

Since 

$$p_1 \le p_r = p_1' \le p_s' = p_1,$$

we have $p_1 = p_1'$. Moreover, since $1 < p_1 < a$ and $p_1 \mid a$, we have 

$$1 < \frac{a}{p_1} = \prod_{m=2}^{n_1} p_m = \prod_{m=2}^{n_2} p_m' < a.$$

Thus the products $p_2 p_3 \cdots p_{n_1}$ and $p_2' p_3' \cdots p_{n_2}'$ are prime decompositions 
of the same number, and this number lies in the range in which, by the 
induction hypothesis, factorization is unique. Hence $n_1 = n_2$ and $p_2 = p_2'$, 
$p_3 = p_3'$, .... Thus the two representations for a were identical. ▲ 

In view of its fundamental position in the theory of numbers, we give a 
second proof of the unique factorization theorem, this one being independent 
of the notion of GCD. As a preliminary step we note that if an integer n 
has the unique factorization property, and if a prime p divides n, then p 
actually occurs in the prime factorization of n; for otherwise we could 
write n/p as a product of primes, not necessarily unique, and multiplying 
through by p would yield a second representation for n as a product of 
primes. 

Primes, by definition, have unique factorization; so let us consider an 
integer n > 1 which is not prime and let us suppose, as induction hypothesis, 
that all integers a with 1 < a < n have the unique factorization 
property. Suppose that n does not have it, and that we have the two 
representations 

$$n = p_1 p_2 \cdots = p_1' p_2' \cdots,$$

where we again order the factors so that $p_1 \le p_2 \le \cdots$ and $p_1' \le p_2' \le \cdots$. 
We can suppose that no $p_i$ is the same as any $p_j'$, since otherwise the 
common factor could be cancelled and the induction hypothesis applied. 
Because there are at least two factors in each representation, we have 

$$n \ge p_1 p_2 \ge p_1^2 \quad \text{or} \quad p_1 \le \sqrt{n},$$

and similarly $p_1' \le \sqrt{n}$. Thus the number $a = n - p_1 p_1'$ is nonnegative. 
If a were 0, we would have 

$$n = p_1 p_1' = p_1 p_2 \cdots, \qquad p_1' = p_2 \cdots,$$

and hence $p_1' = p_2$, contrary to our assumption. Therefore $a \ge 1$. But 
we also find that $a \ne 1$, since a = 1 would give $n = p_1 p_1' + 1$, a number 
not divisible by $p_1$. Hence a > 1. By the induction hypothesis, a has 
unique factorization, and since both $p_1$ and $p_1'$ divide a, it follows from the 
preliminary remark that both of these primes must actually occur in the 
factorization of a. Furthermore, they are distinct, and consequently 
$a = p_1 p_1' b$, where b is a positive integer. But then 

$$n = a + p_1 p_1' = p_1 p_1'(b + 1) = p_1 p_2 \cdots, \qquad p_1'(b + 1) = p_2 \cdots,$$

and since $p_2 \cdots$ is a number with unique factorization and divisible by 
$p_1'$, it must be that $p_1'$ is one of the primes $p_2, \ldots$, contrary to our 
hypothesis. This contradiction shows that n has unique factorization, and 
it follows from the induction axiom that all integers larger than 1 have 
this property. ▲ 

At this point the question might well be raised, why all the fuss about 
a theorem whose truth seems perfectly obvious? The answer is, of course, 
that it seems obvious only because one is accustomed to it from experi- 
ence with the small integers, and that one therefore believes that it is also 
true for larger integers. But believing and knowing are not the same 
thing. 

It might be instructive to consider a situation rather similar to the one 
we have been concerned with, in which factorization is not unique. Instead
of taking all the positive integers as our domain of discussion, suppose
that we consider only those of the form 4k + 1, namely 1, 5, 9, 13, ....
Call this set of integers D. The product of two elements of D is again in
D, since

    (4k + 1)(4m + 1) = 4(4km + k + m) + 1.

We could say that an element of D is prime in D if it is larger than 1
and has no factors in D except itself and 1; thus the first few numbers
which are prime in D are 5, 9, 13, 17, 21, 29, .... It is now quite
straightforward to show that every integer greater than 1 in D can be represented
as a product of integers prime in D, but the unique factorization theorem
does not hold, since, for example, 441 can be represented as products of
numbers prime in D in two distinct ways: 21·21 and 9·49. The difficulty
here is that D is not large enough, i.e., it does not contain the numbers 3
and 7, for example, which would be necessary to restore the unique
factorization of 441. There is also no reason to suppose that the full set of
integers is large enough, until it has been proved to be the case.
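
The example can be checked mechanically. The following Python sketch is an illustration only; the helper names in_D, is_prime_in_D, and factorizations_in_D are assumptions introduced here and do not come from the text. It lists the first few numbers prime in D and enumerates the ways of writing 441 as a product of such numbers.

```python
def in_D(n):
    """True if n is a positive integer of the form 4k + 1."""
    return n >= 1 and n % 4 == 1

def is_prime_in_D(n):
    """True if n > 1 lies in D and has no factor in D other than 1 and n."""
    if not in_D(n) or n == 1:
        return False
    # Candidate proper divisors in D are 5, 9, 13, ..., all below n.
    return not any(n % d == 0 and in_D(n // d) for d in range(5, n, 4))

def factorizations_in_D(n, smallest=5):
    """All nondecreasing lists of D-primes (each >= smallest) with product n."""
    if n == 1:
        return [[]]
    found = []
    for d in range(smallest, n + 1, 4):
        if n % d == 0 and is_prime_in_D(d):
            found += [[d] + rest for rest in factorizations_in_D(n // d, d)]
    return found

first_primes = [n for n in range(2, 30) if is_prime_in_D(n)]
print(first_primes)               # [5, 9, 13, 17, 21, 29]
print(factorizations_in_D(441))   # [[9, 49], [21, 21]]
```

Both 9·49 and 21·21 appear, so factorization into numbers prime in D is not unique for 441.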


PROBLEMS


1. Show that if the reduced fraction a/b is a root of the equation

    c_0 x^n + c_1 x^{n-1} + ... + c_n = 0,

where x is a real variable and c_0, c_1, ..., c_n are integers with c_0 ≠ 0, then a | c_n
and b | c_0. In particular, show that if k is an integer, then √k is rational if and
only if it is an integer.

2. The unique factorization theorem shows that each integer a > 1 can 
be written uniquely as a product of powers of distinct primes. If the primes
which do not divide a are included in this product with exponents 0, we can
write

    a = ∏_{i=1}^∞ p_i^{α_i},

where p_i is the ith prime, α_i ≥ 0 for each i, α_i = 0 for sufficiently large i, and
the α_i's are uniquely determined by a. Show that if also

    b = ∏_{i=1}^∞ p_i^{β_i},

then

    (a, b) = ∏_{i=1}^∞ p_i^{min(α_i, β_i)},

where the symbol min (α, β) means the smaller of α and β. Use this fact to give
different solutions to Problems 9 and 10, Section 2-2.

3. Show that the Diophantine equation

    x^2 - y^2 = N

is solvable in nonnegative integers x and y if and only if N is odd or divisible by 4.
Show further that the solution is unique if and only if |N| or |N|/4, respectively,
is unity or a prime. [Hint: Factor the left side.]

4. Show that every integer can be uniquely represented as the product of a 
square and a square-free number, the latter being an integer not divisible by the 
square of any prime. 

5. Suppose that there are h primes not exceeding the positive integer x, so
that π(x) = h. How many square-free numbers composed of one or more of
these primes are there? How many squares not larger than x are there? Using
the result of Problem 4, deduce that

    π(x) ≥ log x / (2 log 2).


6. Show that the number 1 + 1/2 + 1/3 + ... + 1/n is not an integer for
n > 1. [Hint: Consider the highest power of 2 occurring among 2, 3,..., n, 
and show that it occurs in only a single term.] 

7. Suppose that n = ∏_{i=1}^r p_i^{α_i}, where now the p_i are the primes actually
dividing n, so that α_i > 0 for 1 ≤ i ≤ r. Show that every positive divisor of
n is to be found exactly once among the terms resulting when the product

    ∏_{i=1}^r (1 + p_i + p_i^2 + ... + p_i^{α_i})

is multiplied out. Deduce that the sum of the positive divisors of n is

    ∏_{i=1}^r (p_i^{α_i + 1} - 1) / (p_i - 1),

and that the number of divisors of n is ∏_{i=1}^r (α_i + 1). Use the latter result to
give a new proof that τ(n) is odd if and only if n is a square.


2-4 The linear Diophantine equation. For simplicity, we consider only 
the equation in two variables 


    ax + by = c.    (1)


It is easy to devise a scheme for finding an infinite number of solutions 
of this equation if any exist; the procedure can best be explained by means 
of a numerical example, say 5x + 22y = 18. Since x is to be an integer,
(1/5)(18 - 22y) must also be integral. Writing

    x = (18 - 22y)/5 = 3 - 4y + (3 - 2y)/5,

we see that (1/5)(3 - 2y) must also be an integer, say z. This yields

    2y + 5z = 3.

We now repeat the argument, solving as before for the unknown which
has the smaller coefficient:

    y = (3 - 5z)/2 = 1 - 2z + (1 - z)/2,
    (1 - z)/2 = t,    z = 1 - 2t.

Clearly, z will be an integer for any integer t, and we have

    y = (3 - 5(1 - 2t))/2 = -1 + 5t,
    x = (18 - 22(-1 + 5t))/5 = 8 - 22t.

What we have shown is that any solution x, y of the original equation must
be of this form. By substitution, it is immediately seen that every pair 
of numbers of this form constitutes a solution, so that we have a general 
solution of the equation. 

The same idea could be applied in the general case, but it is somewhat 
simpler to adopt a different approach. First of all, it should be noted 
that the left side of (1) is always divisible by (a, b), so that (1) has no 
solution unless d | c, where d = (a, b). If this requirement is satisfied, we
can divide through in (1) by d to obtain a new equation

    a'x + b'y = c',    (2)


where (a', b') = 1. We now use property (d) of Section 2-2 to assert the
existence of numbers x_0' and y_0' such that

    a'x_0' + b'y_0' = 1,

whence c'x_0', c'y_0' is a solution of (2). We put x_0 = c'x_0' and y_0 = c'y_0'.
Now suppose that x_1, y_1 is any solution of (2). We have

    a'x_0 + b'y_0 = c',
    a'x_1 + b'y_1 = c',

and, by subtraction,

    a'(x_0 - x_1) = b'(y_1 - y_0).

Thus a' | b'(y_1 - y_0), and since (a', b') = 1, it must be that a' | (y_1 - y_0).
Similarly, b' | (x_0 - x_1), and since (x_0 - x_1)/b' = (y_1 - y_0)/a', there is
an integer t such that

    x_0 - x_1 = -b't,
    y_0 - y_1 = a't,
or
    x_1 = x_0 + b't,
    y_1 = y_0 - a't.


Conversely, if x_1 and y_1 are related to a solution x_0, y_0 of (2) as in the
equations just written, then

    a'x_1 + b'y_1 = (a'x_0 + a'b't) + (b'y_0 - a'b't) = a'x_0 + b'y_0 = c',

and so x_1, y_1 is also a solution of (2). Since every solution of (1) is a
solution of (2) and conversely, we have the following theorem.


THEOREM 2-6. A necessary and sufficient condition for the equation

    ax + by = c

to have a solution x, y in integers is that d | c, where d = (a, b). If there
is one solution, there are infinitely many; they are exactly the numbers
of the form

    x = x_0 + (b/d)t,    y = y_0 - (a/d)t,

where t is an arbitrary integer and x_0, y_0 is any one solution.
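
Theorem 2-6 translates directly into a short procedure. The Python sketch below is one possible realization, not the book's own method: it uses the extended Euclidean algorithm to find one solution and returns the data of the general solution. The function names extended_gcd and solve_linear_diophantine are assumptions made for this illustration, and a and b are assumed not both zero.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = (a, b) >= 0 and a*u + b*v == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:  # normalize so that the GCD is nonnegative
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v

def solve_linear_diophantine(a, b, c):
    """Return (x0, y0, b/d, a/d), so that x = x0 + (b/d)t, y = y0 - (a/d)t
    gives every solution of a*x + b*y == c, or None if d does not divide c."""
    d, u, v = extended_gcd(a, b)
    if d == 0 or c % d != 0:
        return None
    x0, y0 = u * (c // d), v * (c // d)
    return x0, y0, b // d, a // d

x0, y0, step_x, step_y = solve_linear_diophantine(5, 22, 18)
for t in range(-3, 4):
    x, y = x0 + step_x * t, y0 - step_y * t
    assert 5 * x + 22 * y == 18      # every member of the family is a solution

print(solve_linear_diophantine(4, 6, 7))  # None, since (4, 6) = 2 does not divide 7
```

The particular solution produced this way generally differs from one found by inspection, but by the theorem the two families of solutions coincide.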


There are various ways of obtaining a particular solution. Sometimes 
one can be found by inspection; if not, the method explained at the begin- 
ning of the section may be used or, what is almost the same thing, the 
Euclidean algorithm may be applied to find a solution of the equation 
which results when the original equation is divided by (a,b). The latter 
process of successively eliminating the remainders in the Euclidean al- 
gorithm can be systematized, but we shall not do this here. 

There are many “word problems” which lead to linear Diophantine 
equations that must be solved in positive integers, since only such solu- 
tions have meaning for the original problem. Suppose that the equation 
ax + by = c is solvable in integers. Then we see from Theorem 2-6
that a positive solution will exist if and only if there is an integer t such
that both

    x_0 + (b/d)t > 0    and    y_0 - (a/d)t > 0.

Let us first assume that a and b have opposite signs. Then the coefficients
of t in the above inequalities have the same sign, and so either
both require t to be not too small or both require t to be not too large;
namely, if b > 0 and a < 0, we must have

    t > -(d/b)x_0    and    t > (d/a)y_0,

whereas if b < 0 and a > 0, we must have

    t < -(d/b)x_0    and    t < (d/a)y_0.

In either case there is clearly an integer satisfying the requirements, and
in fact either all integers smaller than a certain one, or all integers larger
than a certain one, will do. Hence, in this case, there are always infinitely
many positive solutions of the equation.

The situation is quite different when a and b have the same sign. We
can suppose that a and b are both positive, since otherwise we can multiply
through in the original equation by -1. Then there is a positive solution
if and only if there is an integer t such that

    -(d/b)x_0 < t < (d/a)y_0,

and the number of positive solutions is the number of integers in this
interval.


EXAMPLE. A sporting-goods store placed a total order of $2490.00
for a number of bicycles at $29 each and a number at $33 each. How many
bicycles of each kind were ordered?

We obtain the equation 29x + 33y = 2490. From the Euclidean algorithm,

    33 = 29·1 + 4,
    29 = 7·4 + 1,

we have

    1 = 29 - 7·4 = 29 - 7(33 - 29) = 29·8 - 33·7,

and therefore a general solution of the equation is

    x = 8·2490 + 33t,
    y = -7·2490 - 29t.

The positive solutions correspond to integers t such that

    -(8·2490)/33 < t < -(7·2490)/29,

or -603.6... < t < -601.03.... Thus there are two solutions,
corresponding to t = -602 and t = -603:

    x = 54, y = 28,
or
    x = 21, y = 57.
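
The example can be confirmed numerically. The sketch below is an independent check, not the method of the text: it first searches by brute force for positive solutions of 29x + 33y = 2490, and then scans the parametrization x = 8·2490 + 33t, y = -7·2490 - 29t for the admissible values of t. The function name positive_solutions and the scanned range of t are assumptions chosen for this illustration.

```python
def positive_solutions(a, b, c):
    """All (x, y) with x > 0, y > 0 and a*x + b*y == c, for positive a and b."""
    solutions = []
    for x in range(1, c // a + 1):
        remainder = c - a * x
        if remainder > 0 and remainder % b == 0:
            solutions.append((x, remainder // b))
    return solutions

print(positive_solutions(29, 33, 2490))  # [(21, 57), (54, 28)]

# The same answer from the general solution: keep the t for which x and y are positive.
admissible_t = [t for t in range(-700, -500)
                if 8 * 2490 + 33 * t > 0 and -7 * 2490 - 29 * t > 0]
print(admissible_t)  # [-603, -602]
```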


PROBLEMS 


1. Find a general solution of the linear Diophantine equation

    2072x + 1813y = 2849.

2. Find all solutions of 19x + 20y = 1909 with x > 0, y > 0.

*3. Let m and n be positive integers, with m < n, and let x_0, x_1, ..., x_k be
all the distinct numbers among the two sequences

    0/m, 1/m, ..., m/m    and    0/n, 1/n, ..., n/n,

arranged so that x_0 < x_1 < ... < x_k. Describe k as a function of m and n.
What is the shortest distance between successive x's?


4. Let a and b be positive relatively prime integers. Then for certain
nonnegative integers n (which we shall briefly refer to as the representable integers),
the equation ax + by = n has a solution with x ≥ 0, y ≥ 0, whereas for
other n it does not have such a solution. For example, if n = 0, 3, 5, or 6, or if
n ≥ 8, then 3x + 5y = n has such a solution. Show that this example is
typical, in the following sense: (a) There is always a number N(a, b) such that
for every n ≥ N(a, b), n is representable. (It may be helpful to combine the
theory of the present section with the elementary analytic geometry of the line
ax + by = c, interpreting x and y in the latter case as real variables. Note
that so far it is only the existence of N(a, b) which is in question, and not its
size.) *(b) The minimal value of N(a, b) is always (a - 1)(b - 1). *(c) Exactly
half the integers up to (a - 1)(b - 1) are representable.

5. Apply the method discussed in the text, of repeatedly solving for the
unknown with smallest coefficient, to solve the equation 1321x + 5837y + 1926z =
2983.

6. Find necessary and sufficient conditions that the Diophantine equation
a_1 x_1 + ... + a_n x_n = b should have an integral solution.

7. When Mr. Smith cashed a check for x dollars and y cents, he received 
instead y dollars and x cents, and found that he had two cents more than twice 
the proper amount. For how much was the check written?


2-5 The least common multiple.


THEOREM 2-7. The number

    ⟨a, b⟩ = |ab| / (a, b)

has the following properties: (1) ⟨a, b⟩ ≥ 0; (2) a | ⟨a, b⟩ and b | ⟨a, b⟩;
(3) If a | m and b | m, then ⟨a, b⟩ | m.


Proof: (1) Obvious.

(2) Since (a, b) | b, we can write

    ⟨a, b⟩ = |a| · |b|/(a, b),

and hence a | ⟨a, b⟩. Similarly,

    ⟨a, b⟩ = |b| · |a|/(a, b),

and so b | ⟨a, b⟩.

(3) Let m = ra = sb, and set

    d = (a, b),    a = a_1 d,    b = b_1 d.

Then

    m = r a_1 d = s b_1 d;

thus a_1 | s b_1, and since (a_1, b_1) = 1, we must have a_1 | s. Thus s = a_1 t,
and

    m = t a_1 b_1 d = t · ab/d. ▲


Because of the properties listed in Theorem 2-7, the number ⟨a, b⟩ is
called the least common multiple (LCM) of a and b. The definition is easily
extended to the case of more than two numbers, just as for the GCD.
It is useful to remember that

    |ab| = (a, b)⟨a, b⟩.
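
As a small numerical illustration of Theorem 2-7, the following Python sketch computes the LCM from the GCD and checks the stated properties on one pair of numbers. The function name lcm is an assumption of this sketch, the inputs are assumed not both zero, and property (3) is only sampled over a finite range rather than proved.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple |ab| / (a, b); assumes a and b are not both zero."""
    return abs(a * b) // gcd(a, b)

a, b = 12, 18
m = lcm(a, b)
print(m)  # 36

# Property (2): both a and b divide the LCM.
assert m % a == 0 and m % b == 0
# Property (3), sampled: every common multiple below 500 is a multiple of the LCM.
assert all(c % m == 0 for c in range(1, 500) if c % a == 0 and c % b == 0)
# The product relation between GCD and LCM.
assert abs(a * b) == gcd(a, b) * m
```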


PROBLEMS 


1. In the notation of Problem 2, Section 2-3, show that

    ⟨a, b⟩ = ∏_{i=1}^∞ p_i^{max(α_i, β_i)},

where max (α, β) is the larger of α and β. Use this to give a second proof of part
(3) of Theorem 2-7.

2. Show that

    min (α, max (β, γ)) = max (min (α, β), min (α, γ)).

(By symmetry, one may suppose that β ≥ γ.) Deduce that

    (a, ⟨b, c⟩) = ⟨(a, b), (a, c)⟩.


CHAPTER 3 
CONGRUENCES 


3-1 Introduction. The problem of solving the Diophantine equation
ax + by = c is that of finding an x such that ax and c leave the same
remainder when divided by b, since then b | (c - ax), and we can take
y = (c - ax)/b. As we shall see, there are also many other instances in which
a comparison must be made of the remainders after dividing each of two
numbers a and b by a third, say m. Of course, if the remainders are the
same, then m | (a - b), and conversely, and this might seem to be an
adequate notation. But as Gauss noticed, for most purposes the following
notation is more suggestive: if m | (a - b), then we write a ≡ b (mod m),
and say that a is congruent to b modulo m. (This has nothing to do with
geometric congruence, of course.)

The use of the symbol “≡” is suggested by the similarity of the relation
we are discussing to ordinary equality. Each of these two relations is an
example of an equivalence relation, i.e., of a relation R between elements
of a set, such that if a and b are arbitrary elements, either a stands in the
relation R to b (more briefly, aRb) or it does not, and which furthermore
has the following properties:


(a) aRa.
(b) If aRb, then bRa.
(c) If aRb and bRc, then aRc.


These are called the reflexive, symmetric, and transitive properties,
respectively. That equality between numbers is an equivalence relation
is obvious (or it may be taken as an axiom): either a = b or a ≠ b;
a = a; if a = b, then b = a; if a = b and b = c, then a = c.


THEOREM 3-1. Congruence modulo a fixed number m is an equivalence
relation.


Proof:


(a) m | (a - a), so that a ≡ a (mod m).

(b) If m | (a - b), then m | (b - a); thus if a ≡ b (mod m), then b ≡ a
(mod m).

(c) If m | (a - b) and m | (b - c), then a - b = km, b - c = lm, say,
so that a - c = (k + l)m; thus if a ≡ b (mod m) and b ≡ c (mod m),
then a ≡ c (mod m). ▲

Since the student will have occasion later to use other equivalence
relations, we pause to show a simple but important property common to all
such relations. If R is an equivalence relation with respect to a set S, then
corresponding to each element a of S there is a subset S_a of S which
consists of exactly those elements of S which are equivalent to a, so that b
is in S_a if and only if aRb. Now if aRb, then the sets S_a and S_b are identical:
if c is in S_b, then cRb, and since aRb, also cRa, so that c is in S_a; by the
same argument with a and b interchanged, every element of S_a is in S_b. If, on the
other hand, a is not equivalent to b, then S_a and S_b are disjoint; that is,
they have no element in common. For if c is in S_a and in S_b, then cRa
and cRb, which entails aRb. These disjoint sets S_a, which together make
up S, are called equivalence classes; an element of an equivalence class is
sometimes called a representative of the class, and a complete system of
representatives is any subset of S which contains exactly one element from
each equivalence class.
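
The partition into equivalence classes can be made concrete. The Python sketch below groups a finite list of elements under any relation supplied as a two-argument function; it is an illustration under the assumption that the function really is an equivalence relation, and the name equivalence_classes is introduced here only for the example. Comparing with a single representative of each class suffices precisely because of the property just proved.

```python
def equivalence_classes(elements, related):
    """Partition elements into classes under the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):   # one representative per class is enough
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class
    return classes

# Congruence modulo 3 on the integers -4, ..., 7.
classes = equivalence_classes(range(-4, 8), lambda a, b: (a - b) % 3 == 0)
print(classes)  # [[-4, -1, 2, 5], [-3, 0, 3, 6], [-2, 1, 4, 7]]

# A complete system of representatives: one element from each class.
representatives = [cls[0] for cls in classes]
print(representatives)  # [-4, -3, -2]
```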

Section 3-3 provides examples of all these notions, with somewhat 
different terminology. 


PROBLEMS 


1. Decide whether each of the following is an equivalence relation. If it is, 
describe the equivalence classes. 

(a) Congruence of triangles. 

(b) Similarity of triangles. 

(c) The relations “≠”, “>”, and “≥”, relating real numbers.

(d) Parallelism of lines. 

(e) Having the same mother. 

(f) Having a parent in common. 

2. Define the relation R by aRb if and only if a | b. Show that R is reflexive
and transitive, but not symmetric. Find other mathematically defined relations
to show that any one or two of the properties of reflexivity, symmetry, and
transitivity may hold without the others.

3. Show that if a ≡ b (mod m) and d | m, then a ≡ b (mod d).


3-2 Elementary properties of congruences. One reason for the su- 
periority of the congruence notation is that congruences can be combined 
in much the same way as can equations. 


THEOREM 3-2. If a ≡ b (mod m) and c ≡ d (mod m), then a + c ≡
b + d (mod m), ac ≡ bd (mod m), and ka ≡ kb (mod m) for every integer
k.


Proof: These statements follow immediately from the definition. For
if a ≡ b (mod m), then m | (a - b) and, similarly, m | (c - d), and therefore
m | (a - b + c - d), or m | ((a + c) - (b + d)). But this means that
a + c ≡ b + d (mod m). Secondly, if m | (a - b) and m | (c - d), then
m | (a - b)(c - d). But

    (a - b)(c - d) = (ac - bd) + b(d - c) + d(b - a),

and since m divides the second and third terms on the right-hand side,
also m | (ac - bd). Finally, if m | (a - b), then also m | k(a - b) for every k. ▲

The situation is a little more complicated when we consider dividing 
both sides of a congruence by an integer. We cannot deduce from ka ≡ kb
(mod m) that a ≡ b (mod m), for it may be that part of the divisibility of
ka - kb = k(a - b) by m is accounted for by the presence of the factor
k. What is clearly necessary is that the part of m which does not divide k
should divide a - b.


THEOREM 3-3. If ka ≡ kb (mod m) and (k, m) = d, then

    a ≡ b (mod m/d).


Proof: Theorem 2-3. ▲


THEOREM 3-4. If f(x) is a polynomial with integral coefficients, and
a ≡ b (mod m), then f(a) ≡ f(b) (mod m).


Proof: Let f(x) = c_0 + c_1 x + ... + c_n x^n. If a ≡ b (mod m), then for
every nonnegative integer j,

    a^j ≡ b^j (mod m),
and
    c_j a^j ≡ c_j b^j (mod m),

by Theorem 3-2. Adding these last congruences for j = 0, 1, ..., n, we
have the theorem. ▲
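
A quick numerical check may help fix the content of Theorems 3-3 and 3-4. The Python sketch below is illustrative only: the modulus, the numbers, and the polynomial are arbitrary choices made here, and the helper poly is not from the text.

```python
from math import gcd

def poly(coeffs, x):
    """Evaluate c_0 + c_1 x + ... + c_n x^n for coeffs = [c_0, ..., c_n]."""
    return sum(c * x**j for j, c in enumerate(coeffs))

m = 12

# Theorem 3-3: 3*5 and 3*9 are congruent (mod 12), but 5 and 9 are not.
k, a, b = 3, 5, 9
assert (k * a - k * b) % m == 0
assert (a - b) % m != 0
# Since (3, 12) = 3, the theorem guarantees only a congruence modulo 12/3 = 4.
assert (a - b) % (m // gcd(k, m)) == 0

# Theorem 3-4: congruent arguments give congruent polynomial values.
coeffs = [7, -2, 0, 5]          # f(x) = 7 - 2x + 5x^3
a, b = 4, 4 + 5 * m             # a and b are congruent (mod 12)
assert (poly(coeffs, a) - poly(coeffs, b)) % m == 0
print("checks passed")
```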


Theorem 3-4 is basic to much of what follows in this chapter. As a 
first very simple application of it, let us consider the well-known rule 
that a number is divisible by 9 if and only if the sum of the digits in its 
decimal expansion is divisible by 9. If for example n = 3,574,856, then
3 + 5 + 7 + 4 + 8 + 5 + 6 = 38, and since 38 is not divisible by 9,
neither is n. Here

    n = 3·10^6 + 5·10^5 + 7·10^4 + 4·10^3 + 8·10^2 + 5·10 + 6,

so that n = f(10), where

    f(x) = 3x^6 + 5x^5 + 7x^4 + 4x^3 + 8x^2 + 5x + 6.

On the other hand, f(1) is exactly the sum of the digits:

    f(1) = 3 + 5 + 7 + 4 + 8 + 5 + 6.

Since 10 ≡ 1 (mod 9), it follows from Theorem 3-4 that also f(10) ≡
f(1) (mod 9), and this implies in particular that either f(10) and f(1) both
are divisible by 9 or neither is.

The same argument applies in general. The decimal representation of n
is always the expression of n as the value of a certain polynomial f(x) for
x = 10, and invariably f(10) ≡ f(1) (mod 9). We see in fact that the rule
can be strengthened in the following way: if n = f(10) and m = g(10),
then

    n + m = f(10) + g(10) ≡ f(1) + g(1) (mod 9),
    nm = f(10) · g(10) ≡ f(1) · g(1) (mod 9);

hence, if n + m = F(10) and nm = G(10), then

    F(10) ≡ F(1) ≡ f(1) + g(1) (mod 9),
    G(10) ≡ G(1) ≡ f(1) · g(1) (mod 9).


In words, these last two congruences say the following: The sum of the
digits in n + m is congruent (mod 9) to the sum of all the digits in n and m,
and the sum of the digits in nm is congruent (mod 9) to the product of the
sum of the digits in n and the sum of the digits in m. This statement provides
a partial check on the correctness of arithmetical operations, called
“casting out nines,” which amounts simply to verifying that the italicized
assertion holds in particular cases. If, for example, we computed 47 + 94
as 131, we could recognize the existence of an error by noting that
(4 + 7) + (9 + 4) = 24 ≡ 6 (mod 9), whereas 1 + 3 + 1 = 5 (mod 9).
Similarly, it cannot be that 47·19 = 793, since (4 + 7)(1 + 9) =
110 ≡ 1 + 1 + 0 = 2 (mod 9), while 7 + 9 + 3 = 19 ≡ 1 (mod 9). On
the other hand, it is also true that 47·19 ≠ 884, even though 8 + 8 +
4 ≡ 2 (mod 9); hence this method does not afford an absolute check on
accuracy.
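
The check is easy to mechanize. The following Python sketch is one possible rendering, with function names (digit_sum, passes_nines_check_sum, passes_nines_check_product) invented for the illustration; it reruns the three numerical cases discussed above.

```python
def digit_sum(n):
    """Sum of the decimal digits of the integer n."""
    return sum(int(ch) for ch in str(abs(n)))

def passes_nines_check_sum(n, m, claimed):
    """True if the claimed value of n + m survives casting out nines."""
    return (digit_sum(n) + digit_sum(m) - digit_sum(claimed)) % 9 == 0

def passes_nines_check_product(n, m, claimed):
    """True if the claimed value of n * m survives casting out nines."""
    return (digit_sum(n) * digit_sum(m) - digit_sum(claimed)) % 9 == 0

print(passes_nines_check_sum(47, 94, 131))      # False: the error is detected
print(passes_nines_check_product(47, 19, 793))  # False: the error is detected
print(passes_nines_check_product(47, 19, 884))  # True, although 47 * 19 is 893
```

The last line shows the limitation noted in the text: a wrong answer that happens to lie in the right residue class (mod 9) passes the test.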


PROBLEMS 
1. Let

    f(x) = a_0 x^n + a_1 x^{n-1} + ... + a_n,

where a_0, ..., a_n are integers. Show that if d consecutive values of f (i.e., values
for consecutive integers) are all divisible by the integer d, then d | f(x) for all
integral x. Show by an example that this sometimes happens with d > 1 even
when (a_0, ..., a_n) = 1.

2. Using the fact that 10 ≡ -1 (mod 11), devise a test for divisibility of an
integer by 11, in terms of properties of its digits.

3. Use the fact that 7·11·13 = 1001 to obtain a test for divisibility by
any of the integers 7, 11, or 13.

4. Without carrying out the computations, test the accuracy of the following
equations:
(a) 1097 × 8156 = 8947132, (b) 28^3 + 37^3 = 73605.

5. Show that no square has a decimal expansion ending in 79. More generally,
find all possible two-digit endings for squares.

6. Show that every square is congruent to 0 or 1 (mod 8). Deduce that no
integer of the form 8k + 7 is the sum of the squares of three integers.

7. Show that for every x, x^3 ≡ x (mod 3), and that x^5 ≡ x (mod 5).
Formulate a general conjecture, and test it in some other cases.

8. Show that every quadratic discriminant b^2 - 4ac is congruent to 0 or 1
(mod 4).

9. Show that if (x, 6) = 1, then x^2 ≡ 1 (mod 24).

10. Show that if a ≡ b (mod m), then (a, m) = (b, m).


3-3 Residue classes and arithmetic (mod m). When dealing with con- 
gruences modulo a fixed integer m, the set of all integers breaks down into 
m classes, called the residue classes (mod m), such that any two elements 
of the same class are congruent and two elements from two different classes 
are incongruent. The residue classes are also called arithmetic progressions 
with difference m. For many purposes it is completely immaterial which 
element of one of these residue classes is used; for example, Theorem 3-4 
shows this to be the case when one considers the values modulo m of a 
polynomial with integral coefficients. In these instances it suffices to 
consider an arbitrary set of representatives of the various residue classes; 
that is, a set consisting of one element of each residue class. Such a set 
a_1, a_2, ..., a_m, called a complete residue system modulo m, is characterized
by the following properties.


(a) If i ≠ j, then a_i ≢ a_j (mod m).
(b) If a is any integer, there is an index i with 1 ≤ i ≤ m for which
a ≡ a_i (mod m).


Examples of complete residue systems (mod m) are the set of integers
0, 1, 2, ..., m - 1 and the set 1, 2, ..., m. The elements of a complete
residue system need not be consecutive integers, however; for m = 5 we
could take 1, 22, 13, -6, 2500, for example. More generally, if we write
out the five arithmetic progressions with difference 5:

    ..., -10, -5, 0, 5, 10, 15, ...,
    ...,  -9, -4, 1, 6, 11, 16, ...,
    ...,  -8, -3, 2, 7, 12, 17, ...,
    ...,  -7, -2, 3, 8, 13, 18, ...,
    ...,  -6, -1, 4, 9, 14, 19, ...,

we could choose any one element from each row, that from the first row
being representative of all the integers divisible by 5, that from the second
row being representative of all the integers of the form 5n + 1, that from
the third row being representative of all the integers of the form 5n + 2,
etc.


THEOREM 3-5. If a_1, a_2, ..., a_m is a complete residue system (mod m)
and (k, m) = 1, then ka_1, ka_2, ..., ka_m also is a complete residue
system (mod m).


Proof: We show directly that properties (a) and (b) above hold for this
new set.


(a) If ka_i ≡ ka_j (mod m), then by Theorem 3-3, a_i ≡ a_j (mod m),
whence i = j.

(b) Theorem 2-6 shows that if (k, m) = 1, the congruence kx ≡
a (mod m) has a solution for any fixed a. Let a solution be x_0.
Since a_1, ..., a_m is a complete residue system, there is an index i
such that x_0 ≡ a_i (mod m). Hence kx_0 ≡ ka_i ≡ a (mod m). ▲
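
Properties (a) and (b) together say that the residues of the a_i are exactly 0, 1, ..., m - 1, each occurring once, which gives a simple computational test. The Python sketch below uses that reformulation; the function name is_complete_residue_system is an assumption of this illustration. It checks the system 1, 22, 13, -6, 2500 (mod 5), a few instances of Theorem 3-5, and a case where the hypothesis (k, m) = 1 fails.

```python
def is_complete_residue_system(numbers, m):
    """True if the list has exactly one representative of each residue class mod m."""
    return len(numbers) == m and {a % m for a in numbers} == set(range(m))

system = [1, 22, 13, -6, 2500]
print(is_complete_residue_system(system, 5))   # True

# Theorem 3-5: multiplying by k with (k, 5) = 1 gives another complete system.
for k in (2, 3, 7):
    assert is_complete_residue_system([k * a for a in system], 5)

# The condition (k, m) = 1 is needed: here (2, 6) = 2.
print(is_complete_residue_system([2 * a for a in range(6)], 6))  # False
```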


When we restrict ourselves to a particular residue system (mod m), say 
0,1,...,m — 1, we obtain the “arithmetic (mod m)” if we work out the 
addition and multiplication tables for these m numbers. If, for example, 
we take m = 5, we obtain the following tables: 


                      TABLE 3-1

        + | 0 1 2 3 4            × | 0 1 2 3 4
        --+----------            --+----------
        0 | 0 1 2 3 4            0 | 0 0 0 0 0
        1 | 1 2 3 4 0            1 | 0 1 2 3 4
        2 | 2 3 4 0 1            2 | 0 2 4 1 3
        3 | 3 4 0 1 2            3 | 0 3 1 4 2
        4 | 4 0 1 2 3            4 | 0 4 3 2 1

           (a)                      (b)

In the first table, the entry in the row beginning with r and the column
beginning with s is the sum r + s, in the sense that it is the representative
of the residue class (mod 5) containing that sum. Thus 3 + 3 = 6 ≡ 1
(mod 5), and this is the 1 in the next-to-last row and column of the first
table. On the other hand, 3·3 = 9 ≡ 4 (mod 5), and 4 is the entry in
the next-to-last row and column of the second table. This “modular”
multiplication is perhaps new to the student, but “modular” addition is
familiar to everyone through our systems of keeping time. When we say,
“It is now seven o’clock; in 8 hours it will be three o’clock,” we are simply
adding modulo 12: 7 + 8 = 15 ≡ 3 (mod 12). Similarly, the statement,
“Five days from next Thursday will be a Tuesday,” entails addition
(mod 7).

In the special case m = 5 it is possible to perform not only addition 
and multiplication but also subtraction and division, except for division 
by zero. In general, to subtract a from b means “find x such that a + x
is b.” In ordinary arithmetic the word “is” in the quoted sentence means
“is equal to,” whereas in arithmetic (mod m) it must be taken to mean
“is congruent to, modulo m.” With this meaning we can verify that
subtraction (mod m) is always possible by noting that in the addition table
(Table 3-1a), each row in the body of the table contains all of the numbers
0, 1, 2, 3, 4, and each just once. To subtract 3 from 2 or to find what
must be added to 3 to yield 2, we look along the row headed 3 until we
encounter the 2, and obtain the number at the head of the column
containing it, namely 4, as the difference: 2 - 3 ≡ 4 (mod 5). Division is
carried out in the same way in Table 3-1(b); being able to do so depends
on the fact that, excluding the first row and column in the body of the
table, each of the numbers 1, 2, 3, 4 occurs exactly once in each row.
Here we have interpreted the division of b by a as the finding of an x
such that b ≡ ax (mod 5).

With respect to division, a composite modulus is somewhat less
satisfactory, because the fundamental principle is no longer valid that a product
is zero only if one of the factors is zero. For example, 2·3 ≡ 0 (mod 6),
even though neither 2 nor 3 is ≡ 0 (mod 6). This situation is reflected in
the fact that division is not always possible, since, for example, there is
no sense to be attached to the symbol 1/2 (mod 6) because there is no
x for which 2x ≡ 1 (mod 6). We shall return to this question in Section 3-5.
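
These remarks about arithmetic (mod m) can be tried out directly. The Python sketch below is a small illustration with helper names (multiplication_table, divide_mod) introduced here; it rebuilds one row of the multiplication table (mod 5), performs the subtraction 2 - 3 (mod 5), and searches for quotients by brute force.

```python
def multiplication_table(m):
    """Rows r = 0, ..., m-1 of the multiplication table for the residues 0, ..., m-1."""
    return [[(r * s) % m for s in range(m)] for r in range(m)]

def divide_mod(b, a, m):
    """All x among 0, ..., m-1 with a*x congruent to b (mod m).
    Division of b by a makes sense when exactly one such x exists."""
    return [x for x in range(m) if (a * x - b) % m == 0]

print(multiplication_table(5)[3])  # [0, 3, 1, 4, 2]
print((2 - 3) % 5)                 # 4
print(divide_mod(1, 2, 5))         # [3]
print(divide_mod(1, 2, 6))         # []  (1/2 has no meaning mod 6)
print((2 * 3) % 6)                 # 0   (a zero product with nonzero factors)
```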


PROBLEMS 


1. Let m > 1 be fixed. Show that if the integers a_1, a_2, ..., a_k have any
two of the following three properties, they also have the third, and hence
constitute a complete residue system (mod m):


(a) If i ≠ j, then a_i ≢ a_j (mod m);

(b) if a is any integer, there is an index i with 1 ≤ i ≤ k for which
a ≡ a_i (mod m);

(c) k = m.


Prove Theorem 3-5 by verifying (a) and (c), rather than (a) and (b) as is done
in the text.


2. Prove a theorem similar to Theorem 3-5, concerning the numbers ka_1 + l,
ka_2 + l, ..., ka_m + l, in which l is any fixed integer.


3-4 Reduced residue systems and Euler’s φ-function. The reason that
we use the adjective “complete” when speaking of a residue system is
that there is another kind which is also frequently useful, called a
reduced residue system. This is a set of integers a_1, ..., a_h, incongruent
(mod m) and relatively prime to m, such that if a is any integer prime to
m, there is an index i, 1 ≤ i ≤ h, for which a ≡ a_i (mod m). In other
words, a reduced residue system is a set of representatives, one from each
of the residue classes containing integers prime to m. [Clearly, (a, m) =
(b, m) if a ≡ b (mod m). For if a and b are congruent (mod m), then
m | (a - b), and since (a, m) | m, we have (a, m) | (a - b). It follows that
(a, m) | b, and consequently that (a, m) | (b, m). By similar reasoning,
(b, m) | (a, m), and therefore (a, m) = (b, m).] For example, 1 and 5
constitute a reduced residue system (mod 6), and 1, 2, 3, 4, 5, 6 a reduced
residue system (mod 7). In the case of prime modulus p, a reduced residue
system results from a complete residue system by omission of the single
number divisible by p.

The number h of elements in a reduced residue system (mod m) is the
number of positive integers not exceeding m and prime to m. This quantity,
which depends on m, is customarily designated by φ(m), and is called
Euler’s φ-function, after the Swiss mathematician Leonhard Euler. It
might be mentioned that for m > 1, φ(m) can also be defined as the
number of positive integers less than m and prime to m, since for such m,
(m, m) > 1. For m = 1, however, the two definitions give different
values.
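
Both notions can be computed straight from the definitions. The following Python sketch does so by brute force, which is adequate only for small m; the function names reduced_residue_system and phi are assumptions of this illustration.

```python
from math import gcd

def reduced_residue_system(m):
    """The positive integers not exceeding m and prime to m."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]

def phi(m):
    """Euler's function: the number of elements in a reduced residue system (mod m)."""
    return len(reduced_residue_system(m))

print(reduced_residue_system(6))        # [1, 5]
print(reduced_residue_system(7))        # [1, 2, 3, 4, 5, 6]
print([phi(m) for m in range(1, 11)])   # [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
```

Note that phi(1) is 1 under this definition, since the integer 1 itself is counted.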


THEOREM 3-6. If a_1, ..., a_{φ(m)} is a reduced residue system (mod m)
and (k, m) = 1, then also ka_1, ..., ka_{φ(m)} is a reduced residue system
(mod m).


The proof is exactly parallel to that of Theorem 3-5.
Table 3-2 lists the first few values of φ(m):


                TABLE 3-2

     m | φ(m) ||  m | φ(m) ||  m | φ(m)
    ---+------++----+------++----+------
     1 |   1  || 11 |  10  || 21 |  12
     2 |   1  || 12 |   4  || 22 |  10
     3 |   2  || 13 |  12  || 23 |  22
     4 |   2  || 14 |   6  || 24 |   8
     5 |   4  || 15 |   8  || 25 |  20
     6 |   2  || 16 |   8  || 26 |  12
     7 |   6  || 17 |  16  || 27 |  18
     8 |   4  || 18 |   6  || 28 |  12
     9 |   6  || 19 |  18  || 29 |  28
    10 |   4  || 20 |   8  || 30 |   8

One immediately notices that for m > 2, the values of φ(m) are even.
This is always the case, since if a is one of the integers counted in φ(m),
that is, one of the integers not larger than m and prime to m, then m - a
is another such integer [for clearly (a, m) = (m - a, m)]. The two
integers a and m - a are distinct, since a = m - a gives m = 2a,
which is inconsistent with the assumption that (m, a) = 1, unless a = 1,
m = 2. Hence, for m > 2, the integers counted in φ(m) can be paired
off, and so the number of them must be even.

Aside from the evenness of φ(m), and the fact that φ(p) = p - 1 if
p is a prime, the values of the φ-function seem to be highly irregular.
However, as we shall soon see, it is possible to compute the value of
φ(m) very quickly if the prime factorization of m is known. The φ-function
has many interesting properties, and it occurs repeatedly in
number-theoretic investigations.

One feature to note from the above table is that in certain cases at least,
the values φ(m) and φ(n) can be multiplied together to give φ(mn); for
example, φ(3)φ(7) = φ(21), and φ(4)φ(5) = φ(20). On the other hand,
φ(4)φ(6) ≠ φ(24). The correct rule is as follows:


THEOREM 3-7. If (m, n) = 1, then φ(mn) = φ(m)φ(n).


(A function with this property is called a multiplicative function. For 
another example, see Problem 10, Section 2-2.) 


Proof: Take integers $m$, $n$ with $(m, n) = 1$, and consider the numbers
of the form $mx + ny$. If we can so restrict the values which $x$ and $y$
assume that these numbers form a reduced residue system (mod $mn$),
there must be $\varphi(mn)$ of them. But their number is also the product of
the number of values which $x$ assumes and the number of values which
$y$ assumes. Clearly, in order for $mx + ny$ to be prime to $m$, it is necessary
that $(m, y) = 1$, and likewise we must have $(n, x) = 1$. Conversely, if
these last two conditions are satisfied, then $(mx + ny, mn) = 1$, since
in this case any prime divisor of $m$, or of $n$, divides exactly one of the
two terms in $mx + ny$. Hence let $x$ range over a reduced residue system
(mod $n$), say $x_1, \ldots, x_{\varphi(n)}$, and let $y$ run over a reduced residue system
(mod $m$), say $y_1, \ldots, y_{\varphi(m)}$. If for some indices $i$, $j$, $k$, $l$ we have

$$mx_i + ny_j \equiv mx_k + ny_l \pmod{mn},$$

then

$$m(x_i - x_k) + n(y_j - y_l) \equiv 0 \pmod{mn}.$$

Since divisibility by $mn$ implies divisibility by $m$, we have

$$m(x_i - x_k) + n(y_j - y_l) \equiv 0 \pmod{m},$$
$$n(y_j - y_l) \equiv 0 \pmod{m},$$
$$y_j \equiv y_l \pmod{m},$$

whence $j = l$. Similarly, $i = k$. Thus the numbers $mx + ny$ so formed
are incongruent (mod $mn$). Now let $a$ be any integer prime to $mn$; in
particular, $(a, m) = 1$ and $(a, n) = 1$. Then Theorem 2-6 shows that
there are integers $X$, $Y$ (not necessarily in the chosen reduced residue
systems) such that $mX + nY = a$, whence also $mX + nY \equiv a \pmod{mn}$.
Since $(m, Y) = (n, X) = 1$, there is an $x_i$ such that $X \equiv x_i \pmod{n}$,
and there is a $y_j$ such that $Y \equiv y_j \pmod{m}$. This means that there are
integers $k$, $l$ such that $X = x_i + kn$, $Y = y_j + lm$. Therefore

$$mX + nY = m(x_i + kn) + n(y_j + lm) \equiv mx_i + ny_j \equiv a \pmod{mn}.$$

Hence, as $x$ and $y$ run over fixed reduced residue systems (mod $n$) and
(mod $m$), respectively, $mx + ny$ runs over a reduced residue system
(mod $mn$), and the proof is complete. ▲
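
The construction used in the proof can be carried out explicitly for small moduli. The sketch below is our own illustration, with hypothetical helper names `reduced_residues` and `combined_system` and the arbitrary choice $m = 4$, $n = 9$. It forms every number $mx + ny$ with $x$ in a reduced residue system (mod $n$) and $y$ in one (mod $m$), reduces the results (mod $mn$), and lets us compare them with the reduced residue system (mod $mn$).

```python
from math import gcd

def reduced_residues(k: int) -> list[int]:
    """The integers in 1..k that are prime to k."""
    return [a for a in range(1, k + 1) if gcd(a, k) == 1]

def combined_system(m: int, n: int) -> list[int]:
    """Sorted residues (mod mn) of m*x + n*y, with x over a reduced
    system (mod n) and y over a reduced system (mod m)."""
    if gcd(m, n) != 1:
        raise ValueError("m and n must be relatively prime")
    return sorted(
        (m * x + n * y) % (m * n)
        for x in reduced_residues(n)
        for y in reduced_residues(m)
    )

m, n = 4, 9
values = combined_system(m, n)
```

For this choice the list `values` has $\varphi(4)\varphi(9) = 12$ entries, all distinct, and it coincides with `reduced_residues(36)`.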


THEOREM 3-8.

$$\varphi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right),$$

where the notation indicates a product over all the distinct primes which
divide $m$.


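Proof: By Theorem 3-7, if

$$m = \prod_{i=1}^{r} p_i^{\alpha_i},$$

then

$$\varphi(m) = \prod_{i=1}^{r} \varphi(p_i^{\alpha_i}).$$

But we can easily evaluate $\varphi(p^\alpha)$ directly: all the positive integers not
exceeding $p^\alpha$ are prime to $p^\alpha$ except the multiples of $p$, and there are just
$p^{\alpha - 1}$ of these, so that

$$\varphi(p_i^{\alpha_i}) = p_i^{\alpha_i} - p_i^{\alpha_i - 1} = p_i^{\alpha_i}\left(1 - \frac{1}{p_i}\right).$$

Thus

$$\varphi(m) = \prod_{i=1}^{r} p_i^{\alpha_i}\left(1 - \frac{1}{p_i}\right) = \prod_{i=1}^{r} p_i^{\alpha_i} \cdot \prod_{i=1}^{r}\left(1 - \frac{1}{p_i}\right) = m \prod_{p \mid m}\left(1 - \frac{1}{p}\right).$$ ▲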
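
The product formula of Theorem 3-8 translates directly into a procedure for computing $\varphi(m)$ from the prime factorization. The following sketch is an illustration only: the names `prime_factors` and `phi` are ours, and trial division is assumed as the factoring method because it is the simplest, not because it is efficient.

```python
def prime_factors(m: int) -> list[int]:
    """Distinct primes dividing m, found by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:            # whatever remains is a prime factor
        primes.append(m)
    return primes

def phi(m: int) -> int:
    """phi(m) = m * product of (1 - 1/p) over primes p dividing m."""
    result = m
    for p in prime_factors(m):
        # Multiply by (1 - 1/p) in exact integer arithmetic;
        # p still divides result at this point.
        result -= result // p
    return result
```

Working with `result -= result // p` avoids fractions altogether, so the answer is an exact integer.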


For example, the four integers 1, 5, 7, 11 are all those which do not
exceed 12 and are prime to 12, and

$$\varphi(12) = 12\left(1 - \tfrac{1}{2}\right)\left(1 - \tfrac{1}{3}\right) = 4.$$

In Theorem 3-8 we have an example of the $\prod$-symbol used for a product
in which the variable index does not run over all the integers up to a
certain one, but over the integers satisfying certain conditions. Whenever
the range of summation or multiplication consists of anything more
complicated than all the integers of a certain interval, the description of the
range is written entirely below the $\sum$ or $\prod$. Further examples occur in
the proof of the next theorem. Perhaps it should also be mentioned that
the symbol $\sum 1$ means to add as many 1's as there are integers satisfying
the conditions occurring below the $\sum$; in other words, it is the number of
integers satisfying these conditions.


THEOREM 3-9.

$$\sum_{d \mid n} \varphi(d) = n.$$

Proof: Let $d_1, \ldots, d_k$ be the positive divisors of $n$. We separate the
integers between 1 and $n$ inclusive into classes $C(d_1), \ldots, C(d_k)$, putting
an integer into the class $C(d_i)$ if its GCD with $n$ is $d_i$. The number of
elements in $C(d_i)$ is then

$$\sum_{\substack{a \le n \\ (a, n) = d_i}} 1,$$

and since every integer up to $n$ is in exactly one of the classes, we have

$$\sum_{d_i \mid n} \; \sum_{\substack{a \le n \\ (a, n) = d_i}} 1 = n.$$

The number of integers $a$ such that $1 \le a \le n$ and $(a, n) = d_i$ is exactly
equal to the number of integers $b$ such that $1 \le b \le n/d_i$ and $(b, n/d_i) = 1$;
in fact, multiplying the $b$'s by $d_i$, we obtain the $a$'s. But from the definition
of the Euler function, the number of $b$'s is clearly $\varphi(n/d_i)$. Thus

$$\sum_{d_i \mid n} \varphi\left(\frac{n}{d_i}\right) = n,$$

which is equivalent to the theorem, since, as $d_i$ runs over the divisors of
$n$, $n/d_i$ also runs over these divisors, but in reverse order. ▲
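
The separation into classes used in the proof is easy to reproduce by machine. The sketch below is our own illustration (the names `classes_by_gcd` and `phi` and the choice $n = 12$ are assumptions made for the example). It sorts the integers from 1 to $n$ by their GCD with $n$, then checks that the class belonging to $d$ has $\varphi(n/d)$ members and that the values $\varphi(d)$ add up to $n$.

```python
from math import gcd

def phi(m: int) -> int:
    """Euler's function, counted from the definition."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def classes_by_gcd(n: int) -> dict[int, list[int]]:
    """Map each divisor d of n to the class C(d) = {a <= n : (a, n) = d}."""
    classes: dict[int, list[int]] = {}
    for a in range(1, n + 1):
        classes.setdefault(gcd(a, n), []).append(a)
    return classes

n = 12
C = classes_by_gcd(n)
class_sizes_match = all(len(C[d]) == phi(n // d) for d in C)
divisor_sum = sum(phi(d) for d in C)   # the keys of C are the divisors of n
```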


To illustrate the theorem and its proof, take $n = 12$. Then

$$\varphi(1) + \varphi(2) + \varphi(3) + \varphi(4) + \varphi(6) + \varphi(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12,$$

$$C(1) = \{1, 5, 7, 11\}, \quad C(2) = \{2, 10\}, \quad C(3) = \{3, 9\},$$
$$C(4) = \{4, 8\}, \quad C(6) = \{6\}, \quad C(12) = \{12\}.$$

PROBLEMS


1. Prove with the help of Theorem 3-8 that if (a,b) = d, then 


2. Show that if $n > 1$, then the sum of the positive integers less than $n$ and
prime to it is

$$\frac{n\varphi(n)}{2}.$$

[Hint: If $m$ satisfies the conditions, so does $n - m$.]
3. Show that if $d \mid n$, then $\varphi(d) \mid \varphi(n)$.
4. Let $n$ be positive. Show that any solution of the equation

$$\varphi(x) = 4n + 2$$

is of one of the forms $p^\alpha$ or $2p^\alpha$, where $p$ is a prime of the form $4s - 1$. Deduce
that there is no solution of the equation $\varphi(x) = 14$. [Hint: Use the factorization
of $\varphi(x)$ as given in Theorem 3-8.]

*5. Let $f(x)$ be a polynomial with integral coefficients, and let $\psi(n)$ denote the
number of values

$$f(0), f(1), \ldots, f(n - 1)$$

which are prime to $n$.
(a) Show that $\psi$ is multiplicative:

$$\psi(mn) = \psi(m) \cdot \psi(n) \quad \text{if } (m, n) = 1.$$

(b) Show that

$$\psi(p^\alpha) = p^{\alpha - 1}(p - b_p),$$

where $b_p$ is the number of integers $f(0), f(1), \ldots, f(p - 1)$ which are divisible
by the prime $p$.


6. How many fractions r/s are there satisfying the conditions 
(r,s) = 1, O<r<s<n? 


7. (a) Use Theorem 3-6 to show that if $p \nmid a$, then the congruence $ax \equiv b$
(mod $p$) is solvable. (Consider separately the cases $p \mid b$ and $p \nmid b$.) (b) What
connection does (a) have with the multiplication table (mod 5) of the preceding
section?

8. It follows from Theorem 3-4 that if $a \equiv b \pmod{m}$, then $a^n \equiv b^n \pmod{m}$.
Is it always true that if $u \equiv v \pmod{m}$, then $a^u \equiv a^v \pmod{m}$? Construct
tables of the smallest positive residues of $a, a^2, a^3, \ldots$ (mod $m$) for $m = 7$, 12,
18, with $a$ running over a complete or reduced residue system (mod $m$). Make
conjectures concerning the periodicity, length of period, etc., of the powers of
a fixed number (mod $m$).

*9. A complex number $\xi$ is said to be an $n$th root of unity if $\xi^n = 1$, and a
primitive $n$th root of unity if, in addition, $\xi^m \neq 1$ for $0 < m < n$.

(a) Show that the powers of $\xi$, a primitive $n$th root of unity, form a periodic
sequence, of period $n$.

(b) If $(m, n) = d$, show that $\eta = \xi^m$ is an $(n/d)$th root of unity.

(c) Show that in fact $\eta$ is a primitive $(n/d)$th root of unity. [Hint: Suppose
that $\eta$ is a primitive $r$th root, and apply Theorem 1-1.]

(d) Supposing that there is at least one, how many primitive $n$th roots of
unity are there? (Remember that the equation $x^n - 1 = 0$ has only $n$ complex
roots.)

*10. Show that for each $n$ the equation $\varphi(x) = n$ has only finitely many
solutions. [Hint: Show that $\varphi(p^\alpha)$ can be made arbitrarily large by making $p^\alpha$
sufficiently large.]


3-5 Linear congruences. Because of the analogy between congruences
and equations, it is natural to ask about the solution of congruences
involving one or more (integral) unknowns. In the case of an algebraic
congruence $f(x) \equiv 0 \pmod{m}$, where $f(x)$ is a polynomial in $x$ with integral
coefficients, we see by Theorem 3-4 that if $x = a$ is a solution, so is every
element of the residue class containing $a$. For this reason it is customary,
for such congruences, to list only the solutions between 0 and $m - 1$,
inclusive, with the understanding that any $x$ congruent to one of those
listed is also a solution. Similarly, when the number of roots of a certain
congruence is mentioned, it is actually the number of residue classes that
is meant. Attention must be given, however, to the modulus with respect
to which the solutions are counted since, for example, the arithmetic
progression $\ldots, -3, -1, 1, 3, 5, \ldots$ constitutes a single residue class
(mod 2), but two residue classes (mod 4): the elements $\ldots, -3, 1, 5, \ldots$
make up the class of integers $\equiv 1 \pmod{4}$, and the remaining ones the
class of integers $\equiv 3 \pmod{4}$.

When we list residue classes as solutions of a congruence, we are, in
effect, doing exactly the same thing as when we solve an equation. The
solution of the equation $5x = 3$ is given by $x = 3/5$; in other words, the
solution of an equation is described by another equation, in which $x$
occurs as one side of the equation and the other does not involve $x$ at all.
Similarly, a solution of the congruence $5x \equiv 3 \pmod{7}$ is given by
$x \equiv 2 \pmod{7}$; thus the solution of a congruence is described by another
congruence, where $x$ appears alone on one side and the other side is free
of $x$.

With these remarks in mind, let us proceed to the details. The simplest
case to treat is the linear congruence in one unknown; that is, the
congruence

$$ax \equiv b \pmod{m}.$$

As we have already noticed, this is equivalent to the linear Diophantine
equation

$$ax - my = b.$$

By Theorem 2-6 this equation can be solved if and only if $(a, m) \mid b$, and
if it is solvable and if $x_0$, $y_0$ is a solution, then a general solution is

$$x \equiv x_0 \pmod{\tfrac{m}{d}}, \qquad y \equiv y_0 \pmod{\tfrac{a}{d}},$$

where $d = (a, m)$. In particular, $x$ is unique modulo $m/d$. Among the
numbers $x$ satisfying the first of these congruences, the numbers

$$x_0, \quad x_0 + \frac{m}{d}, \quad x_0 + \frac{2m}{d}, \quad \ldots, \quad x_0 + \frac{(d - 1)m}{d}$$

are incongruent (mod $m$), whereas all other such numbers $x$ are congruent
(mod $m$) to one of these. Hence we have the following theorem:


THEOREM 3-10. A necessary and sufficient condition that the congruence

$$ax \equiv b \pmod{m}$$

be solvable is that $(a, m) \mid b$. If this is the case, there are exactly $(a, m)$
solutions (mod $m$).


While Theorem 3-10 asserts the existence of a solution under appropriate 
circumstances and predicts the number of such solutions, it says nothing 
about the process of finding them. If no solution can be found by inspec- 
tion, then the simplest procedure is to convert the congruence to an 
equation and solve by the method given at the beginning of Section 2-4. 

Consider, for example, the congruence

$$34x \equiv 60 \pmod{98}.$$

Since $(34, 98) = 2$ and $2 \mid 60$, there are just two solutions, to be found from

$$17x \equiv 30 \pmod{49}.$$

This is equivalent to $17x - 49y = 30$, and we obtain

$$x = \frac{49y + 30}{17} = 3y + 2 - \frac{2y + 4}{17}, \qquad z = \frac{2y + 4}{17},$$
$$y = \frac{17z - 4}{2} = 8z - 2 + \frac{z}{2}, \qquad t = \frac{z}{2},$$
$$z = 2t.$$

Take $z = 0$; then $t = 0$, $y = -2$, $x = -4$. Hence

$$x \equiv -4 \pmod{49},$$

and the two solutions of the original congruence are

$$x \equiv -4, 45 \pmod{98}.$$
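
The procedure can also be mechanized. The sketch below is an illustration under our own assumptions: the name `solve_linear_congruence` is hypothetical, and in place of the step-by-step reduction used in the text it obtains the reciprocal of $a/d$ modulo $m/d$ from Python's built-in `pow(a, -1, m)` (available from Python 3.8). It follows Theorem 3-10 in dividing through by $d = (a, m)$ and then listing the $d$ solutions (mod $m$).

```python
from math import gcd

def solve_linear_congruence(a: int, b: int, m: int) -> list[int]:
    """All x in 0..m-1 with a*x ≡ b (mod m); empty if (a, m) does not divide b."""
    d = gcd(a, m)
    if b % d != 0:
        return []                      # unsolvable by Theorem 3-10
    a1, b1, m1 = a // d, b // d, m // d
    # (a1, m1) = 1, so a1 has a reciprocal modulo m1.
    x0 = (b1 * pow(a1, -1, m1)) % m1
    # The d solutions (mod m) are x0, x0 + m/d, ..., x0 + (d-1)m/d.
    return [x0 + k * m1 for k in range(d)]
```

For the congruence above the function returns 45 and 94, the latter being the least non-negative residue of $-4$ (mod 98).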


Theorem 3-10 provides the answer to a question alluded to earlier,
namely, When is division possible in arithmetic (mod $m$)? We see now
that an integer $a$ has a reciprocal (mod $m$), that is, the congruence
$ax \equiv 1 \pmod{m}$ is solvable, if and only if $(a, m) = 1$, and
correspondingly that the “reduced fraction” $b/a$ makes sense modulo $m$ if and only
if $(a, m) = 1$. If $m$ is a prime $p$, then division by any number not in the
residue class of 0 is possible, but for composite modulus this is not the
case. Thus

$$\tfrac{3}{4} \equiv 6 \pmod{7} \quad \text{since} \quad 3 \equiv 4 \cdot 6 \pmod{7},$$

and

$$\tfrac{3}{5} \equiv 3 \pmod{6} \quad \text{since} \quad 3 \equiv 3 \cdot 5 \pmod{6},$$

while $\tfrac{3}{4}$ has no meaning (mod 6), since $(4, 6) \nmid 3$ and there is no solution
of $3 \equiv 4x \pmod{6}$.
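
A short sketch of division (mod $m$) follows. It is our own illustration: the name `mod_divide` is hypothetical, and the reciprocal is taken with Python's built-in `pow(a, -1, m)` (Python 3.8 or later). In keeping with the discussion above, the function refuses to divide by $a$ unless $(a, m) = 1$.

```python
from math import gcd

def mod_divide(b: int, a: int, m: int) -> int:
    """The residue x in 0..m-1 with a*x ≡ b (mod m), defined when (a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("a has no reciprocal modulo m")
    return (b * pow(a, -1, m)) % m

quotient_mod_7 = mod_divide(3, 4, 7)   # solves 4x ≡ 3 (mod 7)
quotient_mod_6 = mod_divide(3, 5, 6)   # solves 5x ≡ 3 (mod 6)
```

Refusing every $a$ with $(a, m) > 1$ is stricter than Theorem 3-10 requires, since $ax \equiv b$ may still be solvable when $(a, m) \mid b$; but in that case the solution is no longer unique (mod $m$), so no single quotient can be returned.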

Actually, it is customary not to use the fractional notation, but to refer
instead to the solution of a linear congruence. This is rather like referring
to “the solution of the equation $4x = 3$” instead of “the rational number
$\tfrac{3}{4}$”; neither is logically superior to the other. In the modular case, however,
all of the infinitely many “fractions” $a/b$ which make sense (mod $m$)
are congruent to one or another of the elements of the finite set $0, 1, \ldots,
m - 1$, and there are obvious advantages in dealing with the smaller set.

The solution of a linear congruence in more than one unknown can be
effected by the successive solution of a (usually large) number of
congruences in a single unknown. Consider the congruence

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n \equiv c \pmod{m}.$$

The obviously necessary condition for solvability, that $(a_1, \ldots, a_n, m)$
should divide $c$, is also sufficient, just as in the former case. For, assuming
it satisfied, we can divide through by $(a_1, \ldots, a_n, m)$ to get

$$a_1'x_1 + \cdots + a_n'x_n \equiv c' \pmod{m'},$$

where now $(a_1', \ldots, a_n', m') = 1$. If $(a_1', \ldots, a_{n-1}', m') = d'$, we must
have

$$a_n'x_n \equiv c' \pmod{d'};$$

since $(a_n', d') = 1$, this congruence has just one solution (mod $d'$), and
$m'/d'$ solutions (mod $m'$). Substituting each of these in turn yields $m'/d'$
congruences in $n - 1$ unknowns, and the process can be repeated.
As an example, consider the congruence

$$2x + 7y \equiv 5 \pmod{12}.$$

Here $(2, 7, 12) = 1$. Since $(2, 12) = 2$, we must have

$$7y \equiv 5 \pmod{2},$$

which clearly gives $y \equiv 1 \pmod{2}$, or $y \equiv 1, 3, 5, 7, 9, 11 \pmod{12}$.
These yield

$$2x \equiv 10, 8, 6, 4, 2, 0 \pmod{12},$$

respectively, or

$$x \equiv 5, 4, 3, 2, 1, 0 \pmod{6}.$$

Thus the solutions (mod 12) are

$$x, y \equiv 5, 1;\ 11, 1;\ 4, 3;\ 10, 3;\ 3, 5;\ 9, 5;\ 2, 7;\ 8, 7;\ 1, 9;\ 7, 9;\ 0, 11;\ 6, 11.$$


The general situation is described in the following theorem, which is 
easily proved by induction on the number of unknowns. 


THEOREM 3-11. The congruence

$$a_1x_1 + \cdots + a_nx_n \equiv c \pmod{m}$$

has just $dm^{n-1}$ or no solutions (mod $m$), depending on whether $d \mid c$
or $d \nmid c$, where $d = (a_1, \ldots, a_n, m)$.
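
For small moduli the count in Theorem 3-11 can be confirmed by exhaustive search. The sketch below is our own illustration, with hypothetical function names `count_solutions` and `predicted_count`. It tries every $n$-tuple of residues (mod $m$), so it is practical only when $m^n$ is small.

```python
from functools import reduce
from itertools import product
from math import gcd

def count_solutions(coeffs: list[int], c: int, m: int) -> int:
    """Number of tuples (x_1, ..., x_n) mod m with sum a_i*x_i ≡ c (mod m)."""
    return sum(
        1
        for xs in product(range(m), repeat=len(coeffs))
        if (sum(a * x for a, x in zip(coeffs, xs)) - c) % m == 0
    )

def predicted_count(coeffs: list[int], c: int, m: int) -> int:
    """The count asserted by Theorem 3-11: d*m^(n-1) if d | c, else 0."""
    d = reduce(gcd, coeffs, m)          # d = (a_1, ..., a_n, m)
    return d * m ** (len(coeffs) - 1) if c % d == 0 else 0

example_count = count_solutions([2, 7], 5, 12)
```

For $2x + 7y \equiv 5 \pmod{12}$ the search finds 12 solutions, in agreement with $d = 1$, $n = 2$, $m = 12$.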


Turning now to the simultaneous solution of a system of linear
congruences, we consider the system

$$\alpha_1x \equiv \beta_1 \pmod{m_1}, \ \ldots, \ \alpha_nx \equiv \beta_n \pmod{m_n},$$

$\alpha_i$ and $\beta_i$ integers.
Clearly, no $x$ satisfies all these congruences unless each can be solved
separately. Assuming that this is so, we can suppose that each has already
been solved for $x$, so that we have one or more systems of the form

$$x \equiv c_1 \pmod{m_1}, \ \ldots, \ x \equiv c_n \pmod{m_n}.$$

It is obvious that such a system of $n$ congruences will have no solution
unless every pair taken from among them has a solution. From the first
of the congruences

$$x \equiv c_i \pmod{m_i}, \qquad x \equiv c_j \pmod{m_j},$$

we get $x = c_i + m_iy$; substituting in the second yields

$$m_iy \equiv c_j - c_i \pmod{m_j},$$

and consequently it must be true that

$$(m_i, m_j) \mid (c_j - c_i).$$

It can be shown that this condition, which is necessary for solvability,
is also sufficient: if $(m_i, m_j) \mid (c_j - c_i)$ for every two indices $i$ and $j$
between 1 and $n$ inclusive, then the system $x \equiv c_i \pmod{m_i}$, $i = 1, \ldots, n$,
is solvable, and the solution is unique modulo the LCM of $m_1, \ldots, m_n$.
We shall not carry through the proof of this general statement, but
content ourselves with the following special, but extremely important, case:


THEOREM 3-12 (Chinese Remainder Theorem). Every system of linear
congruences $x \equiv c_1 \pmod{m_1}, \ldots, x \equiv c_n \pmod{m_n}$, in which the
moduli are relatively prime in pairs, is solvable, and the solution is
unique modulo the product of the moduli.


Proof: The theorem is trivially true if there is only one congruence
in the system. Suppose that it is true for every system containing
fewer than $n$ congruences, and consider the system $x \equiv c_i \pmod{m_i}$,
$i = 1, \ldots, n$, in which $(m_i, m_j) = 1$ for $1 \le i < j \le n$. Then by the
induction hypothesis we can solve the last $n - 1$ of the congruences
simultaneously, and obtain in their place a single congruence with modulus
$m_2 \cdots m_n$. Put $m_2 \cdots m_n = M$. Then the system of $n$ congruences is
equivalent to the simpler system

$$x \equiv c_1 \pmod{m_1}, \qquad x \equiv C \pmod{M},$$

for suitable $C$. Repeating the reasoning used above, we find

$$x = c_1 + m_1y,$$
$$c_1 + m_1y \equiv C \pmod{M},$$
$$m_1y \equiv C - c_1 \pmod{M},$$

and since $(m_1, M) = 1$, this last congruence has a unique solution
(mod $M$) by Theorem 3-10. If the solution is $y \equiv C' \pmod{M}$, then we
have

$$y = C' + Mz,$$
$$x = c_1 + m_1(C' + Mz) = (c_1 + m_1C') + m_1Mz,$$

and so $x$ is unique modulo $m_1M = m_1 \cdots m_n$. ▲

Consider for example the system:

$$x \equiv 1 \pmod{3},$$
$$x \equiv 5 \pmod{8},$$
$$x \equiv 11 \pmod{17}.$$

From the first congruence, $x = 3t + 1$, and from the second,

$$3t \equiv 4 \pmod{8},$$

so that

$$t \equiv 4 \pmod{8},$$
$$t = 8u + 4,$$
$$x = 24u + 13.$$

Using the third of the original congruences, we obtain

$$24u \equiv -2 \pmod{17},$$
$$12u \equiv -1 \pmod{17},$$
$$u \equiv 7 \pmod{17},$$
$$u = 17v + 7,$$
$$x = 24 \cdot 17v + 7 \cdot 24 + 13,$$
$$x \equiv 181 \pmod{3 \cdot 8 \cdot 17}.$$
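
The substitution method of the worked example can be written as a short loop. The sketch below is an illustration under our own assumptions: the name `crt` is hypothetical, the moduli are assumed to be relatively prime in pairs (this is not checked), and the reciprocal needed at each step is taken with Python's built-in `pow(M, -1, m)` (Python 3.8 or later). It absorbs one congruence at a time, exactly as the example passes from $x = 3t + 1$ to $x = 24u + 13$ and then to $x \equiv 181$.

```python
def crt(residues: list[int], moduli: list[int]) -> tuple[int, int]:
    """Solve x ≡ c_i (mod m_i) for pairwise relatively prime m_i.

    Returns (x, M): the solution x with 0 <= x < M, where M is the
    product of the moduli.
    """
    x, M = 0, 1
    for c, m in zip(residues, moduli):
        # Seek y with x + M*y ≡ c (mod m); since (M, m) = 1 it is unique mod m.
        y = ((c - x) * pow(M, -1, m)) % m
        x, M = x + M * y, M * m
    return x % M, M

solution, modulus = crt([1, 5, 11], [3, 8, 17])
```

The intermediate values of `y` in this run are 1, 4 and 7; the last two are the residues found for $t$ and $u$ in the example.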


PROBLEMS 
1. Find the solutions of each of the following congruences, and check your 
conclusion against that of Theorem 3-10: 

(a) $33x \equiv 21 \pmod{105}$,

(b) $33x \equiv 22 \pmod{105}$,

(c) $15x \equiv 30 \pmod{105}$.

2. Solve the congruence $6x + 15y \equiv 9 \pmod{18}$.

3. Solve simultaneously:

$$x \equiv 1 \pmod{2},$$
$$x \equiv 2 \pmod{3},$$
$$x \equiv 3 \pmod{5},$$
$$x \equiv 5 \pmod{7}.$$


4. Suppose that the system of congruences,

$$x \equiv a_1 \pmod{m_1},$$
$$\vdots \qquad\qquad (*)$$
$$x \equiv a_n \pmod{m_n},$$

is to be solved, where $m_1m_2 \cdots m_n = M$ and $(m_i, m_j) = 1$ for all $i$ and $j$ with
$i \neq j$. For simplicity we write $x \equiv \{a_1, \ldots, a_n\} \pmod{\{m_1, \ldots, m_n\}}$ as an
abbreviation for (*). In the following, we present a method for writing the
solution of (*) as a combination of solutions of simpler systems.


(a) Solve the system $x \equiv \{0, 0, \ldots, 0\} \pmod{\{m_2, \ldots, m_n\}}$ and thus replace
the system $x \equiv \{1, 0, \ldots, 0\} \pmod{\{m_1, m_2, \ldots, m_n\}}$ by a system of two
congruences. Replace these two congruences in turn by a single congruence
(mod $m_1$), with a new unknown.

(b) Proceeding as in part (a), replace each of the following systems by a single
congruence:

$$x_1 \equiv \{1, 0, 0, \ldots, 0\} \pmod{\{m_1, \ldots, m_n\}};$$
$$x_2 \equiv \{0, 1, 0, \ldots, 0\} \pmod{\{m_1, \ldots, m_n\}};$$
$$\vdots$$
$$x_n \equiv \{0, 0, \ldots, 0, 1\} \pmod{\{m_1, \ldots, m_n\}}.$$


(c) Let the separate congruences obtained in part (b) have solutions
$e_1$ (mod $m_1$), $\ldots$, $e_n$ (mod $m_n$), respectively. Show that

$$x \equiv a_1e_1\frac{M}{m_1} + a_2e_2\frac{M}{m_2} + \cdots + a_ne_n\frac{M}{m_n} \pmod{M}$$

is the solution of the original system (*). Note that once the numbers $e_1, \ldots, e_n$
have been computed, the additional work required by a change in the
constants $a_1, \ldots, a_n$ is almost nil.


5. Apply the procedure of Problem 4 to solve the following systems: 


x = 10 (mod 27), = 8 (mod 27), 
(a) 2 = 2 (mod 25), (b) x = 21 (mod 25), 
x= 3 (mod 8); = l1(mod 8). 
6. Solve the following systems:

(a) $x \equiv 1 \pmod{8}$, $x \equiv 43 \pmod{81}$;
(b) $x \equiv 5 \pmod{8}$, $x \equiv 73 \pmod{81}$;
(c) $x \equiv 4 \pmod{8}$, $x \equiv 19 \pmod{81}$.


7. A famous theorem of P. L. Dirichlet asserts that if $k$ and $l$ are relatively
prime, then there are infinitely many primes of the form $kx + l$. The proof is
rather difficult. Prove the much weaker statement that if $(k, l) = 1$ and $n \neq 0$,
there is an $x$ such that $(kx + l, n) = 1$. [Hint: It must be shown that $x$ can be
chosen such that every prime $p$ dividing $n$ does not divide $kx + l$. Treat
separately the cases $p \mid l$ and $p \nmid l$.]


8. Show that Dirichlet's theorem implies, and is implied by, the following
assertion: if $(k, l) = 1$, then there is at least one prime of the form $kx + l$.


3-6 Polynomial congruences. As is well known, the equality sign is
used between polynomials in two essentially different ways. In the
equation

$$(x + a)^2 = x^2 + 2ax + a^2,$$

for example, it means that the left- and right-hand sides are identical
polynomials, i.e., that the coefficients of $x^2$, and of $x$, are equal, as are the
constant terms. In the equation $x^2 - 2 = 0$, it means that the square
of the number $x$ is equal to 2, and this may be true or false for particular
$x$. If we temporarily refer to the first as algebraic equality, and to the
second as numerical equality, there is the following connection between them:
if two polynomials are algebraically equal, they are also numerically equal
for every value of $x$, and if two polynomials are numerically equal for
every value of $x$ (or even for infinitely many values of $x$), then they are
algebraically equal.

The congruence symbol is also used in two different ways, to relate
polynomials. When $f(x)$ and $f_1(x)$ are polynomials, we write

$$f(x) \equiv f_1(x) \pmod{m} \tag{1}$$

if the coefficients of each power of $x$ in $f$ and $f_1$ are congruent (mod $m$).
For example,

$$(x + a)^2 \equiv x^2 + a^2 \pmod{2},$$

and

$$x(x - 1) \equiv (x - 3)(x + 2) \pmod{6}. \tag{2}$$

This meaning of the congruence symbol is usually intended when there is
no reference to numerical values of $x$, or to roots or solutions of the
congruence. The other meaning of the symbol is that $x$ is a number for which
the numerical values $f(x)$ and $f_1(x)$ are congruent (mod $m$).

It should be noted that the connection indicated above between the
two meanings of equality does not extend to congruences; that is, (1)
is not equivalent to

$$f(x) \equiv f_1(x) \pmod{m} \quad \text{for all } x,$$

since, for example, $x^3 \equiv x \pmod{3}$ for all $x$, whereas obviously $x^3$ and $x$
are not “algebraically” congruent (mod 3).

There are other ways as well in which polynomial congruences behave
differently from polynomial equations. It is a theorem (though possibly
not one familiar to the reader) that every polynomial with integral
coefficients factors in a unique way into irreducible polynomials with
integral coefficients. The congruence (2) above shows that this is no longer
true for polynomials whose coefficients are integers (mod 6), since the two
factorizations given for $x^2 - x$ are genuinely different, while the linear
factors are clearly incapable of further factorization, and so are irreducible.
We also see from (2) that there is no general analog of the familiar theorem
from algebra that the number of roots of a polynomial equation is equal
to the degree of the polynomial, since the quadratic congruence $x^2 - x \equiv
0 \pmod{6}$ has the four solutions $x \equiv 0, 1, 3, 4 \pmod{6}$. If there were
fewer solutions than the degree would indicate, there might be some
hope of finding further ones by considering larger number systems, in
much the same way that the equation $x^2 + 1 = 0$, which is not solvable
in integers, or even in rational numbers or in real numbers, becomes
solvable when complex numbers are allowed. But there is not much to
be done when there are too many solutions.

The reverse situation, in which there are too few solutions, also occurs,
of course. The congruence $x^2 + 1 \equiv 0 \pmod{m}$ has two solutions when
$m = 5$, namely $x \equiv \pm 2 \pmod{5}$, but it has none for $m = 7$. These
examples show also that the strong distinction between real and complex
numbers sometimes disappears in this “modular” arithmetic, since $-1$
already has a square root (mod 5)!
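
Since a congruence has only finitely many candidate solutions, the root counts quoted above can be verified by trying every residue. The following sketch is our own illustration; the name `roots_mod` and the convention of listing coefficients from the highest power down are assumptions made for the example.

```python
def roots_mod(coeffs: list[int], m: int) -> list[int]:
    """Roots in 0..m-1 of the polynomial congruence f(x) ≡ 0 (mod m).

    coeffs lists the coefficients of f from the highest power down.
    """
    def value(x: int) -> int:
        acc = 0
        for a in coeffs:              # Horner's rule, reduced mod m at each step
            acc = (acc * x + a) % m
        return acc

    return [x for x in range(m) if value(x) == 0]

too_many = roots_mod([1, -1, 0], 6)   # x^2 - x ≡ 0 (mod 6)
two_roots = roots_mod([1, 0, 1], 5)   # x^2 + 1 ≡ 0 (mod 5)
no_roots = roots_mod([1, 0, 1], 7)    # x^2 + 1 ≡ 0 (mod 7)
```

The residues 2 and 3 found for the modulus 5 are the classes of $\pm 2$.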

It turns out that some degree of order returns if we restrict attention to 
congruences with prime moduli: polynomials have unique factorization, 
they have no more roots than the degree would indicate, etc. For this 
reason we shall frequently consider theorems valid only for this particular 
case, although some, such as the following one, are true in general. 


THEOREM 3-13 (Factor theorem). If $a$ is a root of the congruence

$$f(x) \equiv 0 \pmod{m},$$

then there is a polynomial $q(x)$ such that

$$f(x) \equiv (x - a)q(x) \pmod{m},$$

and conversely.


Proof: Suppose first that $a$ is a root of $f(x) \equiv 0 \pmod{m}$, and apply
the ordinary long-division algorithm learned in algebra to divide $f(x)$ by
$x - a$. Since the leading coefficient in the divisor is 1, no fractions will
ever be encountered in the process. Since the divisor is of degree 1, the
process can be continued until the remainder is a constant, say $r$, and we
obtain an identity of the sort

$$f(x) = (x - a)q(x) + r.$$

The polynomials on the left and right are algebraically equal, and
therefore they are algebraically congruent modulo $m$:

$$f(x) \equiv (x - a)q(x) + r \pmod{m}.$$

Setting $x \equiv a \pmod{m}$, we have

$$f(a) \equiv 0 \equiv 0 \cdot q(a) + r \pmod{m},$$

whence $r \equiv 0 \pmod{m}$, and

$$f(x) \equiv (x - a)q(x) \pmod{m},$$

as asserted. If conversely this last congruence holds, then clearly
$f(a) \equiv 0 \pmod{m}$. ▲
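
The long division in the proof, with divisor $x - a$, is the familiar synthetic division, and it can be carried out with all coefficients reduced (mod $m$). The sketch below is our own illustration; the name `divide_by_linear` and the highest-power-first coefficient order are assumptions. The remainder it returns is the residue of $f(a)$, so it vanishes exactly when $a$ is a root.

```python
def divide_by_linear(coeffs: list[int], a: int, m: int) -> tuple[list[int], int]:
    """Divide f(x) by (x - a) with coefficients reduced mod m.

    coeffs lists the coefficients of f from the highest power down.
    Returns (quotient coefficients, remainder), so that
    f(x) ≡ (x - a)*q(x) + r (mod m) coefficient by coefficient.
    """
    carried, carry = [], 0
    for c in coeffs:
        carry = (carry * a + c) % m   # one step of synthetic division
        carried.append(carry)
    remainder = carried.pop()         # the final carry is f(a) mod m
    return carried, remainder

quotient, remainder = divide_by_linear([1, -1, 0], 3, 6)   # x^2 - x by x - 3
```

For $f(x) = x^2 - x$, $a = 3$, $m = 6$, the quotient is $x + 2$ with remainder 0, which is the factorization $(x - 3)(x + 2)$ of congruence (2).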


THEOREM 3-14 (Lagrange's theorem). The congruence

$$f(x) \equiv 0 \pmod{p}$$

in which

$$f(x) = a_0x^n + \cdots + a_n, \qquad a_0 \not\equiv 0 \pmod{p},$$

has at most $n$ roots.


Proof: For $n = 1$, the assertion follows from Theorem 3-10. Assume
that every congruence of degree $n - 1$ has at most $n - 1$ solutions, and
that $a$ is a root of the original congruence. Then

$$f(x) \equiv (x - a)q(x) \pmod{p},$$

where $q(x)$ is of exact degree $n - 1$. By the induction hypothesis, $q(x)$
therefore has at most $n - 1$ zeros, say $c_1, \ldots, c_r$, where $r \le n - 1$.
Thus, if $c$ is any number such that $f(c) \equiv 0 \pmod{p}$, then

$$(c - a)q(c) \equiv 0 \pmod{p},$$

so that either

$$c \equiv a \pmod{p}$$

or

$$q(c) \equiv 0 \pmod{p}.$$

In the latter case, $c \equiv c_i \pmod{p}$ for some $i$, $1 \le i \le r$. In other words, the
original congruence has at most $r + 1 \le n$ roots. The theorem now
follows by the induction principle. ▲


PROBLEMS 


1. Let $f(x)$ be a polynomial of degree $n$, with integral coefficients. Show that
if $n + 1$ consecutive values of $f(x)$ are divisible by a fixed prime $p$, then $p \mid f(x)$
for every integral $x$. Cf. Problem 1, Section 3-2.
2. Find all solutions of $x^{12} \equiv 1 \pmod{13}$. [The computation of high powers
is best accomplished by using the binary expansion of the exponent, e.g.,

$$2^2 = 4,$$
$$2^4 = 4^2 \equiv 3,$$
$$2^8 \equiv 3^2 = 9,$$
$$2^{12} = 2^{8+4} \equiv 27 \equiv 1 \pmod{13}.]$$

3. Using only the number of solutions in the preceding problem, and ignoring
any information about the behavior of individual solutions, show that if $d \mid 12$ and
$d < 12$, then the congruence $x^d \equiv 1 \pmod{13}$ has exactly $d$ solutions. [Hint:
Factor $x^{12} - 1$ as $(x^d - 1)q(x)$ and apply Lagrange's theorem.]


The following seven problems sketch a proof of the unique factorization
theorem for polynomials (mod $p$). If $a(x)$ is a polynomial, we shall mean by
$\deg_p a(x)$ the degree of the highest-degree term in which the coefficient is not
divisible by $p$. In particular, if $a$ is a constant not divisible by $p$, then
$\deg_p a = 0$. If $p \mid a$, the symbol $\deg_p a$ is not defined.

4. Let $a(x)$ and $b(x)$ be nonzero polynomials with $\deg_p a(x) = n \ge \deg_p b(x) = m$.
Show that for a suitable constant $c_0$, either $\deg_p (a(x) - c_0x^{n-m}b(x)) < n$ or
$a(x) - c_0x^{n-m}b(x) \equiv 0 \pmod{p}$.

5. Prove by induction that if $a(x)$ and $b(x)$ are nonzero polynomials (mod $p$),
then there are $q(x)$ and $r(x)$ such that $a(x) \equiv b(x)q(x) + r(x) \pmod{p}$ and
either $r(x) \equiv 0 \pmod{p}$ or $\deg_p r(x) < \deg_p b(x)$. [Hint: Put $r_1(x) = a(x) -
c_0x^{n-m}b(x)$; if $\deg_p r_1(x) \ge \deg_p b(x)$, the procedure of Problem 4 can be
repeated.]

6. Let $a(x)$ and $b(x)$ be nonzero polynomials (mod $p$), and let $d(x)$ be any
polynomial of minimum degree such that for some $g(x)$ and $h(x)$,

$$a(x)g(x) + b(x)h(x) \equiv d(x) \pmod{p}.$$

(a) Show that any two polynomials $d(x)$ and $d_0(x)$ with these properties
differ only by a constant factor. [Hint: By an appropriate choice of $g(x)$, $g_0(x)$,
$h(x)$ and $h_0(x)$, the leading coefficients of both $d(x)$ and $d_0(x)$ can be made 1.
Consider $d(x) - d_0(x)$.]

(b) Show that if $d_1(x) \mid_p a(x)$ [that is, if $a(x) \equiv d_1(x)q(x) \pmod{p}$ for suitable
$q(x)$] and $d_1(x) \mid_p b(x)$, then $d_1(x) \mid_p d(x)$.

(c) Show that $d(x) \mid_p a(x)$ and $d(x) \mid_p b(x)$. (See Problem 12, Section 2-2.) We
write $(a(x), b(x))_p = d(x)$.

7. Show that every polynomial $a(x)$ with $\deg_p a(x) \ge 1$ can be represented
as a product of one or more polynomials irreducible (mod $p$).

8. Show that if $a(x) \mid_p b(x)c(x)$, and $(a(x), b(x))_p = 1$, then $a(x) \mid_p c(x)$.

9. Show that if $P(x), P_1(x), \ldots, P_r(x)$ are polynomials irreducible (mod $p$),
and $P(x) \mid_p \prod_{i=1}^{r} P_i(x)$, then for at least one $i$, $P(x) \mid_p P_i(x)$.

10. Show that the representation of a polynomial $a(x)$ with $\deg_p a(x) \ge 1$ as
a product of polynomials irreducible (mod $p$) is unique, except for the order of
factors and the presence of constant factors. [Note that $b(x) \mid_p c(x)$ and
$c(x) \mid_p b(x)$ together imply only that $b(x) \equiv A \cdot c(x) \pmod{p}$ for some
constant $A$.]

11. Find an example of polynomials $a(x)$ and $b(x)$ which are relatively prime
(mod $p$) but not (mod $q$), for suitable primes $p$ and $q$.


3-7 Quadratic congruences with prime modulus. The general problem
of higher-degree congruences is too difficult for further development here,
but we can obtain some information in an elementary way in the special
case of quadratic congruences. Consider the congruence

$$ax^2 + bx + c \equiv 0 \pmod{p}, \qquad p \nmid a, \tag{3}$$

where it is now supposed that $p$ is an odd prime. (The case $p = 2$ is of
no particular interest, since the only distinct quadratic polynomials are
then $x^2$, $x^2 + 1$, $x^2 + x$ and $x^2 + x + 1$, and the solutions of the
corresponding congruences can be given explicitly.) Since $p \nmid 4a$, the congruence
(3) is equivalent to

$$4a^2x^2 + 4abx + 4ac \equiv 0 \pmod{p},$$

and hence to

$$(2ax + b)^2 \equiv b^2 - 4ac \pmod{p}.$$

Let $b^2 - 4ac = d$. If the congruence $u^2 \equiv d \pmod{p}$ is not solvable,
then neither is (3). On the other hand, if $u_1$ is a number such that $u_1^2 \equiv
d \pmod{p}$, then the integer $x_1$ such that $2ax_1 + b \equiv u_1 \pmod{p}$ is a
solution of (3). Conversely, every solution of (3) is related to a solution
of $u^2 \equiv d \pmod{p}$ by such a linear congruence $2ax + b \equiv u \pmod{p}$.
Since this linear congruence has exactly one solution $x$ for each $u$, we see
that there is a one-to-one correspondence between the solutions of (3)
and those of $u^2 \equiv d \pmod{p}$, and we may as well restrict our attention
to the latter.
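
The reduction just described can be followed step by step in code. The sketch below is our own illustration: the name `solve_quadratic_mod_p` is hypothetical, the square roots of $d$ are found by simply trying every residue (adequate only for small $p$), and the reciprocal of $2a$ comes from Python's built-in `pow(2 * a, -1, p)` (Python 3.8 or later). The primality of $p$ is assumed and not checked.

```python
def solve_quadratic_mod_p(a: int, b: int, c: int, p: int) -> list[int]:
    """Solutions in 0..p-1 of a*x^2 + b*x + c ≡ 0 (mod p).

    Assumes p is an odd prime not dividing a.
    """
    if p == 2 or a % p == 0:
        raise ValueError("need an odd prime p with p not dividing a")
    d = (b * b - 4 * a * c) % p                  # d = b^2 - 4ac
    inv_2a = pow(2 * a, -1, p)                   # reciprocal of 2a (mod p)
    # Each u with u^2 ≡ d gives exactly one x from 2ax + b ≡ u (mod p).
    square_roots = [u for u in range(p) if (u * u - d) % p == 0]
    return sorted(((u - b) * inv_2a) % p for u in square_roots)

roots = solve_quadratic_mod_p(2, 3, 1, 7)        # 2x^2 + 3x + 1 ≡ 0 (mod 7)
```

Here $d = 1$, the square roots of $d$ are $u \equiv 1, 6$, and the corresponding solutions are $x \equiv 3, 6 \pmod{7}$.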
Changing the notation slightly, consider the congruence

$$x^2 \equiv a \pmod{p}. \tag{4}$$

If this congruence is solvable, it would be reasonable to say that $a$ is a
square, modulo $p$, but for historical reasons the customary phrase is
“quadratic residue”: $a$ is a quadratic residue of $p$ if (4) is solvable;
otherwise $a$ is a quadratic nonresidue. (Analogous definitions hold for
$n$th-power residues and nonresidues.) We shall now develop a criterion for
deciding whether $a$ is a quadratic residue of $p$:

THEOREM 3-15 (Euler's criterion). A necessary and sufficient condition
that $a$ be a quadratic residue of the odd prime $p$ is that the congruence

$$a^{(p-1)/2} \equiv 1 \pmod{p}$$

hold.

Proof: Instead of (4), we first consider the congruence

$$bx \equiv a \pmod{p}, \tag{5}$$

in which $b$ is one of the numbers $1, 2, \ldots, p - 1$. This linear congruence
is always solvable, since $p \nmid b$, and the solution is unique if we require that
it also be one of the numbers $1, 2, \ldots, p - 1$. Let the solution be $x = b'$.
For fixed $a$, the numbers $b$ and $b'$ will be called associates. We must
distinguish two cases, depending on whether some $b$ is associated with
itself or not.

If for some $b$, say $b_1$, we have $b_1 = b_1'$, then (5) becomes $b_1^2 \equiv a \pmod{p}$,
and we have a solution of (4). Furthermore, in this case, $(p - b_1)^2 =
p^2 - 2pb_1 + b_1^2 \equiv a \pmod{p}$, and since $b_1 \neq p - b_1$, we obtain
two distinct solutions of (4). By Lagrange's theorem there are no others,
so that for all $b$ different from $b_1$ and $p - b_1$, we find that $b$ is different
from its associate $b'$. Thus if $a$ is a quadratic residue of $p$, the integers
$1, \ldots, p - 1$ can be grouped into $(p - 3)/2$ pairs of distinct associates,
the product of the elements of each pair being congruent to $a$ (mod $p$),
together with the two numbers $b_1$ and $p - b_1$. Hence

$$(p - 1)! = \prod_{k=1}^{p-1} k \equiv a^{(p-3)/2} \cdot b_1(p - b_1) \equiv -a^{(p-1)/2} \pmod{p}. \tag{6}$$

On the other hand, if $a$ is a quadratic nonresidue of $p$, the numbers
$1, 2, \ldots, p - 1$ can be grouped into $(p - 1)/2$ pairs of distinct
associates, and

$$(p - 1)! = \prod_{k=1}^{p-1} k \equiv a^{(p-1)/2} \pmod{p}. \tag{7}$$


In order to give a uniform statement of (6) and (7), we define the
Legendre symbol $(a/p)$ [also frequently written $\left(\frac{a}{p}\right)$ or $(a|p)$] to mean 1
if $a$ is a quadratic residue of $p$, and $-1$ if $a$ is a quadratic nonresidue of $p$.
[Here $a$ is called the “first entry,” and $p$ the “second entry.” Note that
$(a/p)$ is not defined if $p \mid a$.] Then (6) and (7) become

$$(p - 1)! \equiv -(a/p)a^{(p-1)/2} \pmod{p}. \tag{8}$$

Taking $a = 1$, and noting that the congruence $x^2 \equiv 1 \pmod{p}$ has the
solution $x = 1$, so that $(1/p) = 1$, we have $(p - 1)! \equiv -1 \pmod{p}$.
Substituting in (8) yields

$$(a/p)a^{(p-1)/2} \equiv 1 \pmod{p},$$

or since $(a/p) = \pm 1$,

$$(a/p) \equiv a^{(p-1)/2} \pmod{p}.$$ ▲

In the course of the proof, we also obtained the following theorem:


THEOREM 3-16 (Wilson's Theorem). If $p$ is prime, then $(p - 1)! \equiv -1$
(mod $p$).

(Strictly speaking, we have proved this only for odd $p$; but the proof
is trivial for $p = 2$.)
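
Both Euler's criterion and Wilson's theorem lend themselves to a numerical check for small primes. The sketch below is our own illustration; the names `legendre`, `has_square_root` and `wilson_residue` are hypothetical, and the primality of $p$ is assumed rather than tested. The Legendre symbol is computed from the power $a^{(p-1)/2}$ using Python's three-argument `pow`, and the result can be compared with a direct search for a square root of $a$.

```python
from math import factorial

def legendre(a: int, p: int) -> int:
    """(a/p) for an odd prime p not dividing a, by Euler's criterion."""
    if a % p == 0:
        raise ValueError("(a/p) is not defined when p divides a")
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def has_square_root(a: int, p: int) -> bool:
    """True if x^2 ≡ a (mod p) has a solution with 1 <= x <= p - 1."""
    return any((x * x - a) % p == 0 for x in range(1, p))

def wilson_residue(p: int) -> int:
    """The least non-negative residue of (p - 1)! modulo p."""
    return factorial(p - 1) % p
```

For a prime $p$, `wilson_residue(p)` should equal $p - 1$, the residue of $-1$. Computing the factorial in full is wasteful for large $p$; it is used here only because it mirrors the statement of the theorem.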


PROBLEMS


1. Show that if $p \equiv 1 \pmod{4}$, then $(a/p) = (-a/p)$.

2. Evaluate the Legendre symbols $(5/17)$, $(6/31)$, and $(8/11)$.

3. For what primes $p$ is the congruence $x^2 \equiv -1 \pmod{p}$ solvable?

4. Show that if $p \nmid a$ and $p \nmid b$, then

(a) $(a^2/p) = 1$,

(b) $(a/p) = (b/p)$ if $a \equiv b \pmod{p}$,

(c) $(ab/p) = (a/p)(b/p)$.

5. Solve the following congruences, or show them to be unsolvable: 

(a) 322 — 54 +7 = 0 (mod 18), 

(b) 52? — 62 + 2 = 0 (mod 18), 

(c) 22-+ 7z-+ 10 = 0 (mod 11). 

6. Use the fact that 7 = —(p — j) (mod p) to pair off the factors in (p — 1)!, 
and thus obtain from Wilson’s theorem a solution of x? = —1 (mod p) when 
p = 1 (mod 4). 


CHAPTER 4 
THE POWERS OF AN INTEGER, MODULO m 


4-1 The order of an integer (mod m). The sequence of powers of a 
fixed positive integer a is a special case of the more general geometric 
progressions studied in algebra. These successive powers are distinct inte- 
gers if a > 1, and they increase quite rapidly. We shall now study the 
sequence which results when the powers are all reduced to their least posi- 
tive remainders (mod m), where m is an integer relatively prime to a. 
Here again, as in the preceding chapter, there are problems whose solu- 
tions for composite modulus are too complicated for inclusion in this text, 
and these will be discussed only for prime modulus. The reader should 
take care to be aware of this restriction whenever it is present. 

We begin with a specific case, the sequence of powers of 2, reduced 
(mod 17). The following congruences hold, the modulus 17 being under- 
stood throughout: 


$$\begin{aligned}
2^0 &\equiv 1, \\
2^1 &\equiv 2, \\
2^2 &\equiv 4, \\
2^3 &\equiv 8, \\
2^4 &\equiv 16, \\
2^5 &\equiv 2 \cdot 16 = 32 \equiv 15, \\
2^6 &\equiv 2 \cdot 15 = 30 \equiv 13, \\
2^7 &\equiv 2 \cdot 13 = 26 \equiv 9, \\
2^8 &\equiv 2 \cdot 9 = 18 \equiv 1, \\
2^9 &\equiv 2 \cdot 1 = 2, \\
2^{10} &\equiv 4.
\end{aligned}$$


We see that there is no point in continuing further, because the sequence
is already repeating itself; since $2^8 \equiv 1$ (mod 17), we have $2^{8+j} = 2^8 \cdot 2^j \equiv
2^j$ (mod 17), and hence any two powers of 2 whose exponents differ by 8
(or a multiple of 8) are congruent to each other (mod 17). In other words,
the sequence is periodic from the beginning, with period 8. The length of
the period is the smallest positive exponent $n$ such that $2^n \equiv 1$ (mod 17).
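The computation above is easy to mechanize. As a small sketch (the helper name `power_residues` is an assumption of this example, not notation from the text), each residue is obtained from the previous one by multiplying by $a$ and reducing:

```python
def power_residues(a, m, count):
    """Least nonnegative residues of a^0, a^1, ..., a^(count-1) modulo m."""
    residues, value = [], 1 % m
    for _ in range(count):
        residues.append(value)
        value = (value * a) % m   # next power, reduced immediately
    return residues

print(power_residues(2, 17, 11))
# [1, 2, 4, 8, 16, 15, 13, 9, 1, 2, 4]
```

The residue 1 reappears at exponent 8, in agreement with the list above.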

What has happened here is entirely typical: if $(a, m) = 1$, the least
positive residues (mod $m$) of $a^0, a^1, \ldots$ always form a periodic sequence,
and this sequence is always periodic from the beginning. To see this, note
first that while there are infinitely many powers of $a$, there are only the $m$
integers $0, 1, \ldots, m - 1$ for them to be congruent to, and hence some
two powers of $a$ must be congruent to each other. Suppose that $a^r \equiv a^s$
(mod $m$), where $r > s$. Then $a^s(a^{r-s} - 1) \equiv 0$ (mod $m$), and since
$(a^s, m) = (a, m) = 1$, we must have $a^{r-s} \equiv 1$ (mod $m$). But then
$a^{r-s+1} \equiv a$, $a^{r-s+2} \equiv a^2$, etc., so the sequence is surely periodic.

Moreover, 1, which is the first element of the sequence, is also the first
number to repeat. For suppose the opposite: that the second occurrence
of 1 is at the power $a^n$, and that for some $r$ and $s$ with $0 < s < r < n$, we
have $a^r \equiv a^s$ (mod $m$). Then, just as before, we can deduce that $a^{r-s} \equiv 1$
(mod $m$), which contradicts the definition of $n$, since $0 < r - s < n$.

The most obvious problem remaining, then, is that of determining the
length of the period. This length cannot be predicted in general, although
for specific $a$ and $m$ it can, of course, be found by computing the sequence.
We can, however, get some useful information about the period length,
the simplest fact being that it is always less than $m$; for if it were not,
the $m$ numbers $a^0, a^1, \ldots, a^{m-1}$ would be distinct and different from
0 (mod $m$), which is clearly impossible.

We call the length of the period the order of $a$ (mod $m$), or the exponent
to which $a$ belongs (mod $m$), and we write $\operatorname{ord}_m a$; as we have seen, the order
can also be defined as the smallest positive integer $n$ such that $a^n \equiv 1$
(mod $m$). To see what values can be expected for the order of $a$ (mod $m$),
consider for example the various sequences of powers (mod 19):


TABLE 4-1

    a  a^2  a^3  a^4  a^5  a^6  a^7  a^8  a^9 a^10 a^11 a^12 a^13 a^14 a^15 a^16 a^17 a^18
    1
    2    4    8   16   13    7   14    9   18   17   15   11    3    6   12    5   10    1
    3    9    8    5   15    7    2    6   18   16   10   11   14    4   12   17   13    1
    4   16    7    9   17   11    6    5    1
    5    6   11   17    9    7   16    4    1
    6   17    7    4    5   11    9   16    1
    7   11    1
    8    7   18   11   12    1
    9    5    7    6   16   11    4   17    1
   10    5   12    6    3   11   15   17   18    9   14    7   13   16    8    4    2    1
   11    7    1
   12   11   18    7    8    1
   13   17   12    4   14   11   10   16   18    6    2    7   15    5    8    9    3    1
   14    6    8   17   10    7    3    4   18    5   13   11    2    9   12   16   15    1
   15   16   12    9    2   11   13    5   18    4    3    7   10   17    8    6   14    1
   16    9   11    5    4    7   17    6    1
   17    4   11   16    6    7    5    9    1
   18    1


Here the orders or period lengths occurring are 1, 2, 3, 6, 9, and 18, i.e.,
exactly the divisors of 18. Thus we know that the period length always
is at most $m - 1$, and that for $m = 19$ all divisors of $m - 1$ occur as
period lengths. The latter, however, is not a general phenomenon, for
consider the case $m = 10$:


    a  a^2  a^3  a^4
    1
    3    9    7    1
    7    9    3    1
    9    1


The numbers 1, 3, 7, 9 constitute a reduced residue system, and the lengths 
of the periods of their powers (mod 10) are 1, 2, and 4. Again we have all 
divisors of a number as period lengths, but this time the number is not 
m — 1. The correct analogy between the two cases is this: when m = 19, 
there are 18 elements in a reduced residue system, and the period lengths 
all divide 18; when m = 10, there are 4 elements in a reduced residue 
system, and the period lengths all divide 4. In these two cases all divisors
actually occur, but this is not always true, as we see by examining the
case $m = 12$: $\varphi(12) = 4$, but $5^2 \equiv 7^2 \equiv 11^2 \equiv 1$ (mod 12), so that all
periods are of length 1 or 2.
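The three cases $m = 19$, $10$, $12$ can be compared side by side. The sketch below finds each order by direct search; the names `order` and `orders_occurring` are assumptions made for this illustration.

```python
from math import gcd

def order(a, m):
    """Smallest n > 0 with a^n ≡ 1 (mod m); requires gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be relatively prime to m")
    n, value = 1, a % m
    while value != 1 % m:
        value = (value * a) % m
        n += 1
    return n

def orders_occurring(m):
    """Sorted list of the distinct orders in a reduced residue system mod m."""
    return sorted({order(a, m) for a in range(1, m) if gcd(a, m) == 1})

for m in (19, 10, 12):
    print(m, orders_occurring(m))
# 19 [1, 2, 3, 6, 9, 18]
# 10 [1, 2, 4]
# 12 [1, 2]
```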
We now prove the general theorems bearing on these data. 


THEOREM 4-1 (Fermat's theorem). If $p \nmid a$, then

$$a^{p-1} \equiv 1 \pmod p.$$

Since $\varphi(p) = p - 1$, this is a special case of

THEOREM 4-2 (Euler's theorem). If $(a, m) = 1$, then

$$a^{\varphi(m)} \equiv 1 \pmod m.$$


Proof: Let $c_1, \ldots, c_{\varphi(m)}$ be a reduced residue system (mod $m$), and let
$a$ be prime to $m$. Then $ac_1, \ldots, ac_{\varphi(m)}$ is also a reduced residue system
(mod $m$), and therefore

$$\prod_{i=1}^{\varphi(m)} ac_i \equiv \prod_{i=1}^{\varphi(m)} c_i \pmod m,$$

whence

$$a^{\varphi(m)} \prod_{i=1}^{\varphi(m)} c_i \equiv \prod_{i=1}^{\varphi(m)} c_i \pmod m.$$

Since $(m, \prod c_i) = 1$, this implies that

$$a^{\varphi(m)} \equiv 1 \pmod m.$$ ▲
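The key step of the proof, that multiplication by $a$ merely rearranges a reduced residue system, can be observed directly. This is a sketch under the assumption that $m \ge 2$; `reduced_residues` and `euler_check` are names chosen for the example.

```python
from math import gcd

def reduced_residues(m):
    """The reduced residue system 1 <= c < m with gcd(c, m) == 1."""
    return [c for c in range(1, m) if gcd(c, m) == 1]

def euler_check(a, m):
    """Return (multiplying by a permutes the residues, a^phi(m) mod m)."""
    residues = reduced_residues(m)
    multiplied = sorted((a * c) % m for c in residues)
    phi = len(residues)                 # phi(m) is the size of the system
    return multiplied == residues, pow(a, phi, m)

print(euler_check(7, 18))   # (True, 1)
```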




THEOREM 4-3. If $a^u \equiv 1$ (mod $m$), then $\operatorname{ord}_m a \mid u$.


Proof: Put $\operatorname{ord}_m a = t$, and let $u = qt + r$, $0 \le r < t$. Then

$$a^u = a^{qt+r} = (a^t)^q \cdot a^r \equiv a^r \equiv 1 \pmod m,$$

and if $r$ were different from zero, there would be a contradiction with
the definition of $t$. ▲


THEOREM 4-4. For every $a$ prime to $m$, $\operatorname{ord}_m a \mid \varphi(m)$.

Proof: The assertion follows immediately from Theorems 4-2 and 4-3. ▲


It is convenient to prove here a theorem which we shall use in the next 
section, and which, in a certain sense, generalizes Fermat’s theorem. 
If we consider the polynomial congruence $x^{p-1} \equiv 1$ (mod $p$), we see from
Fermat's theorem that there are exactly $p - 1$ roots, namely $x \equiv 1, \ldots,
p - 1$ (mod $p$), and, by Lagrange's theorem, this is the maximum number
of permissible roots. The following theorem introduces other polynomials
having the maximum numbers of roots.


THEOREM 4-5. If $p$ is prime and $d$ divides $p - 1$, then there are exactly
$d$ roots of the congruence

$$x^d \equiv 1 \pmod p.$$

Proof: Since $d \mid (p - 1)$, we have

$$x^{p-1} - 1 = (x^d - 1)\,q(x),$$

where $q(x)$ is a polynomial of degree $p - 1 - d$ in $x$. By Lagrange's
theorem, the congruence

$$q(x) \equiv 0 \pmod p$$

has at most $p - 1 - d$ solutions. Since $x^{p-1} \equiv 1$ (mod $p$) has exactly
$p - 1$ solutions, $x^d \equiv 1$ (mod $p$) must have at least

$$(p - 1) - (p - 1 - d) = d$$

solutions. Since it can have no more than this number, it must have
exactly $d$ solutions. ▲
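As an illustration of Theorem 4-5, one can list the roots of $x^d \equiv 1$ (mod 19) for each divisor $d$ of 18. The function name `roots_of_unity` is an assumption of this sketch, which simply tests every nonzero residue.

```python
def roots_of_unity(d, p):
    """All x with 1 <= x < p and x^d ≡ 1 (mod p), found by direct test."""
    return [x for x in range(1, p) if pow(x, d, p) == 1]

for d in (1, 2, 3, 6, 9, 18):
    print(d, roots_of_unity(d, 19))
```

For $d = 6$ this prints the six roots 1, 7, 8, 11, 12, 18, which are exactly the entries of Table 4-1 whose orders divide 6.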


PROBLEMS 
1. Show that if $ab \equiv 1$ (mod $m$), then

$$\operatorname{ord}_m a = \operatorname{ord}_m b.$$

2. Show that if $p$ is an odd prime and $\operatorname{ord}_p a = 2t$, then

$$a^t \equiv -1 \pmod p.$$

Demonstrate that this need not be true if $p = 2$.

3. Show that if $p$ is an odd prime and $a^t \equiv -1$ (mod $p$), then $a$ belongs to
an even exponent $2u$ (mod $p$), and $t$ is an odd multiple of $u$.

*4. Show that if $p$ is an odd prime and $p \mid (x^{2^n} + 1)$, then $p \equiv 1$ (mod $2^{n+1}$).
Deduce that there are infinitely many primes congruent to 1 modulo any fixed
power of 2.

5. Show that for $a > 1$ and $n > 0$, $n \mid \varphi(a^n - 1)$.

6. Compute (a) ordig 12, (b) ordg1 3, (c) ordio 7. 

7. Show that the congruence $f(x) \equiv 0$ (mod $p$), of degree $m < p$, has $m$
distinct roots if and only if $f(x) \mid_p (x^p - x)$. [The notation is that of Problem 6(b),
Section 3-6.]


8. Find all roots of $x^5 \equiv 1$ (mod 31) without computation, using the fact that
2 is a root.


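4-2 Integers belonging to a given exponent (mod p).

THEOREM 4-6. If $\operatorname{ord}_m a = t$, then $\operatorname{ord}_m a^n = t/(n, t)$.

Proof: Let $(n, t) = d$. Then since $a^t \equiv 1$ (mod $m$), we have

$$(a^t)^{n/d} = (a^n)^{t/d} \equiv 1 \pmod m,$$

so that if $\operatorname{ord}_m a^n = t'$, then

$$t' \,\Big|\, \frac{t}{d}. \tag{1}$$

But from the congruence

$$(a^n)^{t'} \equiv 1 \pmod m,$$

we have $t \mid nt'$ by Theorem 4-3, or

$$\frac{t}{d} \,\Big|\, \frac{n}{d}\,t'.$$

Since

$$\left(\frac{t}{d}, \frac{n}{d}\right) = 1,$$

we obtain

$$\frac{t}{d} \,\Big|\, t'. \tag{2}$$

Combining (1) and (2), we have

$$t' = \frac{t}{d}.$$ ▲

For example, we see from Table 4-1 that

$$\begin{aligned}
\operatorname{ord}_{19} 2 &= 18, \\
\operatorname{ord}_{19} 2^2 &= 9, \\
\operatorname{ord}_{19} 2^3 &= 6, \\
\operatorname{ord}_{19} 2^4 &= 9, \\
\operatorname{ord}_{19} 2^5 &= \operatorname{ord}_{19} 13 = 18,
\end{aligned}$$

and correspondingly,

$$\frac{18}{(18, 2)} = 9, \qquad \frac{18}{(18, 3)} = 6, \qquad \frac{18}{(18, 4)} = 9, \qquad \frac{18}{(18, 5)} = 18.$$


We also see from Table 4-1 that the integers having order 18 are exactly
the powers $2^n$ of 2 for which $(18, n) = 1$, namely $2^1$, $2^5 \equiv 13$, $2^7 \equiv 14$,
$2^{11} \equiv 15$, $2^{13} \equiv 3$, and $2^{17} \equiv 10$ (mod 19). This is a special case of the
next theorem.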
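Theorem 4-6 and the remark about the elements of order 18 can both be checked by machine. In this sketch the names `order` and `check_theorem_4_6` are assumptions of the example; orders are computed by brute force.

```python
from math import gcd

def order(a, m):
    """Smallest n > 0 with a^n ≡ 1 (mod m), assuming gcd(a, m) == 1."""
    n, value = 1, a % m
    while value != 1:
        value = (value * a) % m
        n += 1
    return n

def check_theorem_4_6(a, m):
    """Verify ord(a^n) == t / gcd(n, t) for n = 1, ..., 2t, where t = ord(a)."""
    t = order(a, m)
    return all(order(pow(a, n, m), m) == t // gcd(n, t) for n in range(1, 2 * t + 1))

print(check_theorem_4_6(2, 19))
# Powers 2^n with gcd(n, 18) == 1 are the elements of order 18:
print(sorted(pow(2, n, 19) for n in range(1, 19) if gcd(n, 18) == 1))
# [2, 3, 10, 13, 14, 15]
```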


THEOREM 4-7. If any integer belongs to $t$ (mod $p$), then exactly $\varphi(t)$
incongruent numbers belong to $t$ (mod $p$).


Proof: Assume that $\operatorname{ord}_p a = t$. Then by Theorem 4-4, $t \mid (p - 1)$,
and hence by Theorem 4-5 there are exactly $t$ roots of the congruence
$x^t \equiv 1$ (mod $p$). But all the numbers $a, a^2, \ldots, a^t$ are roots of this
congruence and since they are incongruent (mod $p$), they are the only roots.
By Theorem 4-6, the powers of $a$ which belong to $t$ (mod $p$) are the
numbers $a^n$ with $(n, t) = 1$, $1 \le n \le t$, and there are precisely $\varphi(t)$ of these
numbers. ▲


THEOREM 4-8. If $t \mid (p - 1)$, there are $\varphi(t)$ incongruent numbers
(mod $p$) which belong to $t$ (mod $p$).


Proof: Let $d$ run over the divisors of $p - 1$, and for each such $d$ let $\psi(d)$
be the number of integers among $1, 2, \ldots, p - 1$ of order $d$ (mod $p$).
By Theorem 4-4 and Fermat's theorem, each of the integers $1, 2, \ldots,
p - 1$ belongs to exactly one of the $d$. Hence

$$\sum_{d \mid (p-1)} \psi(d) = p - 1.$$

But, by Theorem 3-9, we also have

$$\sum_{d \mid (p-1)} \varphi(d) = p - 1,$$

whence

$$\sum_{d \mid (p-1)} \psi(d) = \sum_{d \mid (p-1)} \varphi(d).$$

By Theorem 4-7, the value of $\psi(d)$ is either zero or $\varphi(d)$ for each $d$, and we
deduce from the last equation that $\psi(d) = \varphi(d)$ for each $d$ dividing $p - 1$,
since otherwise the first sum would be smaller than the second. ▲
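The counting argument can be seen at work for $p = 19$. The sketch below tallies how many residues have each order and compares the tally with $\varphi(d)$; the names `phi`, `order`, and `order_counts` are assumptions of the example, and $\varphi$ is computed naively from its definition.

```python
from collections import Counter
from math import gcd

def phi(n):
    """Euler's function, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def order(a, p):
    """Smallest n > 0 with a^n ≡ 1 (mod p), assuming p does not divide a."""
    n, value = 1, a % p
    while value != 1:
        value = (value * a) % p
        n += 1
    return n

def order_counts(p):
    """psi(d) for each d: the number of a in 1..p-1 having order d (mod p)."""
    return Counter(order(a, p) for a in range(1, p))

counts = order_counts(19)
print({d: (counts[d], phi(d)) for d in sorted(counts)})
# {1: (1, 1), 2: (1, 1), 3: (2, 2), 6: (2, 2), 9: (6, 6), 18: (6, 6)}
```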


If $\operatorname{ord}_m a = \varphi(m)$, then $a$ is said to be a primitive root of $m$. (As noted
earlier, for example, the primitive roots of 19 are 2, 3, 10, 13, 14, 15.)
The importance of this notion lies in the fact that if $g$ is such a primitive
root, then its powers,

$$g, g^2, \ldots, g^{\varphi(m)},$$

are distinct (mod $m$), and are all relatively prime to $m$; they therefore
constitute a reduced residue system modulo $m$. Thus we have a convenient
way of representing all the elements of a reduced residue system, some
implications of which are to be found later in this chapter and in the
problems.

It follows immediately from Theorem 4-6 that the other primitive roots
of $m$ are those powers $g^k$ for which $(k, \varphi(m)) = 1$. Either from this
remark or from Theorem 4-8 we have


THEOREM 4-9. There are exactly $\varphi(p - 1)$ primitive roots of a prime $p$.


The question of just which moduli have primitive roots is not altogether
simple. Without going into details of the proof, we record the answer: the
numbers having primitive roots are exactly those of the forms $2$, $4$, $p^\alpha$, $2p^\alpha$,
where $p$ is any odd prime. We shall use $q$ as a symbol for these numbers
throughout the remainder of the present chapter.

The problem of actually finding a primitive root, for large modulus,
has not been solved, in the sense that no simple algorithm leads straight to
a solution. For given modulus $q$, it is, of course, a finite problem which can
be solved by successively testing the elements of a reduced residue system.
A slightly more rapid method is indicated in Problem 3 at the end of the
next section, but this is also laborious for large $q$, particularly if $\varphi(q)$ has
many distinct prime divisors.
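The "successive testing" just described amounts to computing the order of each element of a reduced residue system and comparing it with $\varphi(m)$. A minimal sketch follows; the names `phi`, `order`, and `primitive_roots` are assumptions of the example, and the method is practical only for small moduli.

```python
from math import gcd

def phi(m):
    """Euler's function, counted directly from the definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def order(a, m):
    """Smallest n > 0 with a^n ≡ 1 (mod m), assuming gcd(a, m) == 1 and m >= 2."""
    n, value = 1, a % m
    while value != 1:
        value = (value * a) % m
        n += 1
    return n

def primitive_roots(m):
    """All primitive roots of m among 1..m-1 (empty if m has none)."""
    target = phi(m)
    return [a for a in range(1, m) if gcd(a, m) == 1 and order(a, m) == target]

print(primitive_roots(19))   # [2, 3, 10, 13, 14, 15]
print(primitive_roots(18))   # [5, 11]
print(primitive_roots(12))   # []
```

The empty list for $m = 12$ agrees with the statement that only $2$, $4$, $p^\alpha$, and $2p^\alpha$ have primitive roots.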


PROBLEMS 
1. Show that if $\operatorname{ord}_m a = t$, $\operatorname{ord}_m b = u$, and $(t, u) = 1$, then
$\operatorname{ord}_m (ab) = tu$.
2. Show that if $p \equiv 1$ (mod 4) and $g$ is a primitive root of $p$, then so is $-g$.
Show by a numerical example that this need not be the case if $p \equiv 3$ (mod 4).

3. Show that if $p$ is of the form $2^n + 1$ and $(a/p) = -1$, then $a$ is a primitive
root of $p$. [Hint: What are the conceivable orders of $a$?]

4. Show that if $p$ is an odd prime and $\operatorname{ord}_p a = t > 1$, then

$$\sum_{k=1}^{t-1} a^k \equiv -1 \pmod p.$$

5. Show that if $q$ has primitive roots, there are $\varphi(\varphi(q))$ of them, and their
product is congruent to 1 (mod $q$) if $q > 6$. [Hint: Represent all primitive roots
in terms of a single one.]

6. Find all primitive roots of 25. 

7. Find a primitive root of 23 and then, using Theorem 4-6, all primitive 
roots of 23. 


4-3 Indices. Let $q$ be a number having primitive roots and let $g$ be
one of them. Then the numbers $g, g^2, \ldots, g^{\varphi(q)}$ are distinct (mod $q$), and
they are all prime to $q$; therefore they constitute a reduced residue system
(mod $q$). The relation between a number $a$ and the exponent of a power of
$g$ which is congruent to $a$ (mod $q$) is very similar to the relation between
an ordinary positive real number $x$ and its logarithm. This exponent is
called an index of $a$ to the base $g$, and written "$\operatorname{ind}_g a$". That is, if $(a, q) = 1$,
then $\operatorname{ind}_g a$ will stand for any number $t$ such that $g^t \equiv a$ (mod $q$). It is
only determined (mod $\varphi(q)$), since $g^{t+\varphi(q)} \equiv g^t$ (mod $q$). The following
facts are immediate consequences of the definition.


THEOREM 4-10. If $g$ is a primitive root of $q$, then

$$\operatorname{ind}_g a \equiv \operatorname{ind}_g b \pmod{\varphi(q)} \quad \text{if } a \equiv b \pmod q,$$

$$\operatorname{ind}_g (ab) \equiv \operatorname{ind}_g a + \operatorname{ind}_g b \pmod{\varphi(q)},$$

and

$$\operatorname{ind}_g a^n \equiv n \operatorname{ind}_g a \pmod{\varphi(q)}.$$


The procedure for finding the indices of the elements of a reduced
residue system is quite simple if a primitive root is known. If $g$ is a
primitive root of $q$, construct a table of two rows and $\varphi(q)$ columns, of which the
second row consists of the integers $1, 2, \ldots, \varphi(q)$, in order. In the first
row enter $g$ in the first column. Multiply this by $g$ and reduce modulo $q$
for the element in the second column; multiply this result by $g$ and reduce
modulo $q$ for the element in the third column, etc. (When the table is
complete, the last element in the first row should be 1.) Then the index
of any element of the first row appears directly below that element.
If, for example, $q = 17$ and $g = 3$, we have the table


a:      3   9  10  13   5  15  11  16  14   8   7   4  12   2   6   1

ind a:  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16


whereas for $q = 18$ and $g = 5$, we have


a:      5   7  17  13  11   1

ind a:  1   2   3   4   5   6
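The table-building procedure described above translates directly into a short loop. In this sketch the function name `index_table` and the choice of a dictionary (residue mapped to index) are assumptions of the example; the value of $\varphi(q)$ is passed in by the caller.

```python
def index_table(g, q, phi_q):
    """Map each residue g^t (mod q), 1 <= t <= phi_q, to its index t."""
    table, value = {}, 1
    for t in range(1, phi_q + 1):
        value = (value * g) % q   # multiply by g and reduce, column by column
        table[value] = t
    return table

print(list(index_table(3, 17, 16)))   # first row of the table for q = 17, g = 3
print(list(index_table(5, 18, 6)))    # [5, 7, 17, 13, 11, 1]
```

Since Python dictionaries preserve insertion order, the keys appear in the same order as the first row of the table, ending with 1.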


By Theorem 4-6, if $\operatorname{ord}_m g = \varphi(m)$, then

$$\operatorname{ord}_m g^n = \frac{\varphi(m)}{(n, \varphi(m))},$$

so that $a$ is a primitive root of $m$ if and only if $(\operatorname{ind} a, \varphi(m)) = 1$. Thus
in the above table we see that the primitive roots of 18 are 5 and 11,
since the only numbers less than $\varphi(18) = 6$ and prime to it are 1 and 5.

Indices are quite useful in solving binomial congruences. For example,
the congruence

$$10x \equiv 8 \pmod{18}$$

implies

$$5x \equiv 4 \pmod 9,$$

which in turn implies

$$\operatorname{ind} 5 + \operatorname{ind} x \equiv \operatorname{ind} 4 \pmod 6,$$
$$\operatorname{ind} x \equiv \operatorname{ind} 4 - \operatorname{ind} 5 \pmod 6.$$

Since 2 is a primitive root of 9, we construct the table as before:


a:      2   4   8   7   5   1

ind a:  1   2   3   4   5   6


Thus

$$\operatorname{ind} x \equiv 2 - 5 \equiv 3 \pmod 6,$$

whence

$$x \equiv 8 \pmod 9,$$

so that

$$x \equiv 8 \text{ or } 17 \pmod{18}.$$
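The worked example can be retraced in code. This sketch assumes that $a$ and $b$ are prime to $q$ and that $g$ is a primitive root of $q$; the names `index_table` and `solve_linear_by_indices` are introduced only for the illustration.

```python
def index_table(g, q, phi_q):
    """Map each residue g^t (mod q), 1 <= t <= phi_q, to its index t."""
    table, value = {}, 1
    for t in range(1, phi_q + 1):
        value = (value * g) % q
        table[value] = t
    return table

def solve_linear_by_indices(a, b, g, q, phi_q):
    """Solve a*x ≡ b (mod q) via ind x ≡ ind b - ind a (mod phi_q)."""
    ind = index_table(g, q, phi_q)
    residue_with_index = {t: r for r, t in ind.items()}
    # Indices are determined only mod phi_q; take the representative in 1..phi_q.
    t = (ind[b % q] - ind[a % q]) % phi_q or phi_q
    return residue_with_index[t]

x = solve_linear_by_indices(5, 4, 2, 9, 6)
print(x)                                        # 8
print([y for y in range(18) if y % 9 == x])     # [8, 17]
```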
We can also use indices to study the special polynomial congruence

$$x^n \equiv c \pmod p;$$

we have already considered the case $n = 2$ in the preceding chapter.
This congruence is entirely equivalent to

$$n \cdot \operatorname{ind} x \equiv \operatorname{ind} c \pmod{p - 1},$$

which has solutions if and only if $(n, p - 1) \mid \operatorname{ind} c$; if this condition is
satisfied there are $d = (n, p - 1)$ roots. Such a criterion has the
disadvantage that it requires knowledge of the value of $\operatorname{ind} c$, and for this
reason the following is more useful.


THEOREM 4-11. Let $(c, q) = 1$, where $q$ is any number which has
primitive roots. Then a necessary and sufficient condition for the congruence

$$x^n \equiv c \pmod q \tag{3}$$

to be solvable is that

$$c^{\varphi(q)/d} \equiv 1 \pmod q,$$

where $d = (n, \varphi(q))$.


Proof: By an argument similar to that just given for prime modulus, a
necessary and sufficient condition for the solvability of (3) is that $\operatorname{ind} c \equiv 0$
(mod $d$). This is equivalent to

$$\frac{\varphi(q)}{d} \operatorname{ind} c \equiv 0 \pmod{\varphi(q)},$$

or, what is the same thing,

$$c^{\varphi(q)/d} \equiv 1 \pmod q.$$ ▲
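The criterion of Theorem 4-11 requires no table of indices, only one modular power. The sketch below compares it with an exhaustive search; the names `phi`, `is_nth_power_residue`, and `solvable_by_search` are assumptions of the example, and the moduli tested are ones known to have primitive roots.

```python
from math import gcd

def phi(m):
    """Euler's function, counted directly from the definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def is_nth_power_residue(c, n, q):
    """Criterion of Theorem 4-11; assumes gcd(c, q) == 1 and q has primitive roots."""
    phi_q = phi(q)
    d = gcd(n, phi_q)
    return pow(c, phi_q // d, q) == 1

def solvable_by_search(c, n, q):
    """Decide x^n ≡ c (mod q) by trying every residue x."""
    return any(pow(x, n, q) == c % q for x in range(1, q))

agree = all(
    is_nth_power_residue(c, n, q) == solvable_by_search(c, n, q)
    for q in (9, 18, 19, 29)
    for c in range(1, q) if gcd(c, q) == 1
    for n in range(1, 8)
)
print(agree)   # True
```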


If $x^n \equiv c$ (mod $m$) is solvable and $(m, c) = 1$, then $c$ is said to be an
$n$th-power residue of $m$, otherwise a nonresidue.


THEOREM 4-12. The number of incongruent $n$th-power residues of $q$
is $\varphi(q)/d$, and these residues are the roots of the congruence

$$x^{\varphi(q)/d} \equiv 1 \pmod q.$$


Proof: The second statement is a paraphrase of Theorem 4-11. Since $q$
has a primitive root $g$, the roots of the congruence $x^{\varphi(q)/d} \equiv 1$ (mod $q$)
are the numbers $g^t$ for which

$$g^{t\varphi(q)/d} \equiv 1 \pmod q,$$

and this requires that $d \mid t$. But the number of multiples $t$ of $d$ with $1 \le
t \le \varphi(q)$ is exactly $\varphi(q)/d$. (Note that this is a generalization of
Theorem 3-15.) ▲
PROBLEMS


1. Given 2 as a primitive root of 29, construct a table of indices, and use it to
solve the following congruences:

(a) $5x \equiv 21$ (mod 29), (b) $17x \equiv 10$ (mod 29),

(c) $17x^2 \equiv 10$ (mod 29), (d) $x^2 \equiv 20$ (mod 29),

(e) $x^2 - 4x - 16 \equiv 0$ (mod 29), (f) $17x^2 - 3x + 10 \equiv 0$ (mod 29),

(g) 172 — 4n-+ 1 = 0 (mod 29), (h) 2” = 17 (mod 29). 

2. Decide whether each of the following congruences is solvable:

(a) $x^5 \equiv 3$ (mod 31), (b) $x^3 - 3x^2 + 3x - 8 \equiv 0$ (mod 19).

3. Let $q$ be a number having primitive roots. Show that $h$ is a primitive root
of $q$ if and only if $h$ is an $r$th power nonresidue of $q$ for every prime $r$ dividing $\varphi(q)$.
[Hint: Write $h = g^k$, where $g$ is a primitive root of $q$, and show that each of the
allegedly equivalent statements is equivalent to the equation $(k, \varphi(q)) = 1$.]
By eliminating all the appropriate powers of the elements of a reduced residue
system, find all primitive roots of 13 and of 29. (Note the connection with
Problem 3, Section 4-2.)

4. Show that if $g$ and $h$ are two different primitive roots of $p$, then

$$\operatorname{ind}_h a \equiv \operatorname{ind}_g a \cdot \operatorname{ind}_h g \pmod{p - 1}.$$


CHAPTER 5 
CONTINUED FRACTIONS 


5-1 Introduction. Much of the content of the preceding chapters de- 
pends, in the end, on the division theorem, Theorem 1-1. We now return 
to this theorem as the source of yet another important range of ideas in 
number theory. For convenience, we change the notation slightly. Let 
s and t be nonzero integers; then Theorem 1-1 asserts that there are unique 
integers a and r such that 


$$s = ta + r, \qquad 0 \le r < t. \tag{1}$$


It is useful now to describe the pair $a$, $r$ by a condition on $a$ rather than by
the above inequality involving $r$, and to do so we write

$$\frac{s}{t} = a + \frac{r}{t}, \qquad 0 \le \frac{r}{t} < 1. \tag{2}$$


We see from these relations that $a$ must be chosen as the largest integer
which does not exceed $s/t$, and conversely, if $a$ is so chosen, then the
difference $s/t - a$ is a fraction which is nonnegative and smaller than 1;
its denominator is $t$ and its numerator is the integer $r$ of (1). The notion
of the largest integer not exceeding a given real number $x$ occurs repeatedly
in the theory of numbers, and it has been dignified by a special notation:
the largest integer not exceeding $x$ is designated by $[x]$. In this notation,
we see that the integers $a$ and $r$ satisfying (1) are $a = [s/t]$ and

$$r = t\left(\frac{s}{t} - \left[\frac{s}{t}\right]\right).$$


The importance of this new way of looking at the division theorem lies
in the possibility of generalizing (2) by allowing an arbitrary real number
to replace the rational number $s/t$. That is, for every real number $x$ we
can write

$$x = a + \xi, \qquad 0 \le \xi < 1, \tag{3}$$

if we choose $a = [x]$, and (3) reduces to the division theorem when $x$ is
rational. In view of this, it is natural to ask whether there is also an analog
of the Euclidean algorithm for real numbers. Returning to the rational
case for a moment, we see that the Euclidean algorithm can be written in
the following form:

$$\begin{aligned}
\frac{s}{t} &= a_0 + \frac{r_0}{t}, & 0 < \frac{r_0}{t} < 1, \\
\frac{t}{r_0} &= a_1 + \frac{r_1}{r_0}, & 0 < \frac{r_1}{r_0} < 1, \\
\frac{r_0}{r_1} &= a_2 + \frac{r_2}{r_1}, & 0 < \frac{r_2}{r_1} < 1, \\
&\;\;\vdots \\
\frac{r_{N-2}}{r_{N-1}} &= a_N.
\end{aligned} \tag{3}$$


In the first equation, $s/t$ can be any rational number at all, so $a_0 = [s/t]$
can be any integer, positive, negative, or zero; but because of the
inequalities, the remaining integers $a_1, a_2, \ldots$ are all positive. If we put $x = s/t$,
$x_1 = t/r_0$, $x_2 = r_0/r_1, \ldots$, then we can write

$$\begin{aligned}
x &= a_0 + \frac{1}{x_1}, & x_1 > 1, \\
x_1 &= a_1 + \frac{1}{x_2}, & x_2 > 1, \\
x_2 &= a_2 + \frac{1}{x_3}, & x_3 > 1, \\
&\;\;\vdots
\end{aligned} \tag{4}$$

and this makes sense even if $x$ is not a rational number. Of course, in this
more general case there is no reason that the process must terminate,
and in fact we see immediately that it cannot do so, since every $x_k$ is then
irrational and therefore never exactly equal to the integer $a_k$.

Let $k$ be a positive integer which is otherwise unrestricted if $x$ is
irrational, but is smaller than $N$ if $x = s/t$ is a rational number and the
equations (3) hold. If from the first $k$ of the equations (4) we eliminate
$x_1, x_2, \ldots, x_{k-1}$, we obtain the relations

$$x = a_0 + \frac{1}{x_1} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{x_2}} = \cdots = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots + \cfrac{1}{a_{k-1} + \cfrac{1}{x_k}}}}. \tag{5}$$



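On the other hand, if $x$ is rational, equations (3) lead to the relation

$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots + \cfrac{1}{a_N}}}. \tag{6}$$


For example, starting with $x = \pi/4 = 0.785398\ldots$, we find

$$\begin{aligned}
x &= 0 + \frac{1}{1.273820\ldots}, \\
1.273820\ldots &= 1 + \frac{1}{3.65202\ldots}, \\
3.65202\ldots &= 3 + \frac{1}{1.53368\ldots}, \\
1.53368\ldots &= 1 + \frac{1}{1.8737\ldots},
\end{aligned}$$

and hence

$$\frac{\pi}{4} = \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{1.8737\ldots}}}},$$

this being the expansion (5) with $k = 4$.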
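The process defined by equations (4) can be sketched as a short loop: take the greatest integer, subtract it, invert, and repeat. The function name `partial_quotients` is an assumption of this example. Exact arithmetic (`Fraction`) is used for rational input; for an irrational number such as $\pi/4$ only a floating-point approximation is available, so only the first few partial quotients should be trusted.

```python
from fractions import Fraction
from math import floor, pi

def partial_quotients(x, k):
    """First k partial quotients a_0, a_1, ... of x, stopping early if a complete quotient is an integer."""
    quotients = []
    for _ in range(k):
        a = floor(x)            # a_n = [x_n]
        quotients.append(a)
        if x == a:              # rational case: the expansion terminates
            break
        x = 1 / (x - a)         # x_{n+1} = 1 / (x_n - a_n)
    return quotients

print(partial_quotients(pi / 4, 4))              # [0, 1, 3, 1]
print(partial_quotients(Fraction(4, 11), 10))    # [0, 2, 1, 3]
```

For a rational number the loop stops by itself, which mirrors the termination of the Euclidean algorithm.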

The complicated fractions occurring in equations (5) and (6) are called
continued fractions or, more precisely, finite continued fractions, because
there are only finitely many $a_i$. The latter integers are called partial
quotients, while the numbers $x_i$ are called complete quotients. A continued
fraction such as (6) in which only integers appear, all of them except
possibly $a_0$ being positive, is said to be simple. We shall be principally
concerned with simple continued fractions, but to use (5) effectively we
shall first prove some theorems concerning (6) which are valid whether or
not the numbers $a_i$ are integers, so long as all of them, except possibly $a_0$,
are positive.


PROBLEMS 


1. Find the simple continued fraction expansions of the following numbers: 
(a) 81/35, (b) 21/18, (c) 5, (d) —86/31, (e) 1/7. 

2. Find the expansion (5), with $k = 4$, of the following numbers: (a) $\pi$,
(b) V3, (c) (1 + 4/8)/2, (A) 277/101. 
3. Prove the following theorems concerning the greatest-integer function.
Here and in the next problem, $x$ and $y$ are real numbers and $n$ is an integer:

(a) $x = [x] + \theta$, where $0 \le \theta < 1$;

(b) $x - 1 < [x] \le x < [x] + 1$;

(c) $[x + n] = [x] + n$;

(d) $[[x]/n] = [x/n]$ for $n > 0$.


4, (a) Graph the following functions: 


(i) $y = [x]$,

(ii) $y = [-x]$,
(iii) $y = -[-x]$,
(iv) y = [22]. 


(b) Show that $[x] + [y] \le [x + y]$.
(c) Show that $-[-x]$ is the smallest integer not less than $x$.


5. The reduced fractions

$$\frac{p_0}{q_0} = \frac{a_0}{1}, \qquad \frac{p_1}{q_1} = a_0 + \frac{1}{a_1}, \qquad \frac{p_2}{q_2} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}}, \quad \ldots$$

are called the convergents of the continued fraction (6). Show, for the numerical
examples of Problem 1, that always

$$\begin{vmatrix} p_n & q_n \\ p_{n+1} & q_{n+1} \end{vmatrix} = \pm 1.$$


6. Let $x$ be a number between 0 and 1. Let $a_1$ be the smallest positive integer
such that the difference

$$x_1 = x - \frac{1}{a_1}$$

is nonnegative, let $a_2$ be the smallest positive integer such that the difference

$$x_2 = x_1 - \frac{1}{a_2}$$

is nonnegative, etc. Show that this leads to a finite expansion of $x$ as a sum
of reciprocals $1/a_1 + 1/a_2 + \cdots$
(that is, that $x_{n+1} = 0$ for some $n$) if and only if $x$ is rational.

7. (a) If $m$ and $n$ are positive integers, show that the number of multiples of
$m$ not exceeding $n$ is $[n/m]$. (b) Let $p$ be a positive prime and $n$ a positive integer.
Show that the power of $p$ occurring in the prime decomposition of $n! =
1 \cdot 2 \cdot 3 \cdots n$ is

$$\left[\frac{n}{p}\right] + \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots.$$

(c) Find the power of 2 occurring in 10!, and also the power of 5. With how
many zeros does the decimal expansion of 10! end? of 100!?
5-2 The basic identities. Let $z_0, z_1, \ldots, z_k$ be real numbers, all of
which, except possibly the first, are positive, and consider the continued
fraction

$$x = z_0 + \cfrac{1}{z_1 + \cfrac{1}{z_2 + \ddots + \cfrac{1}{z_{k-1} + \cfrac{1}{z_k}}}}. \tag{7}$$


Now clearly $x$ is determined completely by the $z$'s, so we shall abbreviate
the cumbersome equation (7) by writing $x = \{z_0; z_1, \ldots, z_k\}$. The
reason for the semicolon in this notation is to emphasize the distinction
between (7) and the continued fraction

$$\frac{1}{x} = \{0; z_0, z_1, \ldots, z_k\};$$

moreover, the number preceding the semicolon plays a rather different
role from the other $z$'s in that it can be zero or negative. By placing
parentheses around the fraction $z_{k-1} + 1/z_k$ at the bottom of (7), we see
that

$$\{z_0; z_1, \ldots, z_{k-1}, z_k\} = \left\{z_0; z_1, \ldots, z_{k-2}, z_{k-1} + \frac{1}{z_k}\right\}. \tag{8}$$

The continued fractions

$$\{z_0;\}, \quad \{z_0; z_1\}, \quad \{z_0; z_1, z_2\}, \quad \ldots, \quad \{z_0; z_1, z_2, \ldots, z_k\}$$

are called the convergents of the expansion (7). If we simplify the first few
to ordinary fractions, we obtain

$$\{z_0;\} = \frac{z_0}{1}, \qquad \{z_0; z_1\} = \frac{z_0 z_1 + 1}{z_1}, \qquad \{z_0; z_1, z_2\} = \frac{z_0 z_1 z_2 + z_0 + z_2}{z_1 z_2 + 1}.$$


We define the numbers $p_n$ and $q_n$, for $n = 0, 1, \ldots, k$, as being the
numerators and denominators of the fractions just written, so that

$$\begin{aligned}
p_0 &= z_0, & q_0 &= 1, \\
p_1 &= z_0 z_1 + 1, & q_1 &= z_1, \\
p_2 &= z_0 z_1 z_2 + z_0 + z_2, & q_2 &= z_1 z_2 + 1,
\end{aligned} \tag{9}$$

and refer to $p_n$ and $q_n$ as the numerator and denominator of the $n$th
convergent of (7). (Note that this is a genuine definition, because the
ratio of $2z_0$ to 2 is the number $\{z_0;\}$; but, according to this definition,
$2z_0$ and 2 are not the numerator and denominator of $\{z_0;\}$.)
Returning to the numerical example given in the preceding section, we
have

$$z_0 = 0, \quad z_1 = 1, \quad z_2 = 3, \quad z_3 = 1, \quad z_4 = 1.8737\ldots,$$

and hence

$$\begin{aligned}
p_0 &= 0, & q_0 &= 1, & \frac{p_0}{q_0} &= 0, \\
p_1 &= 1, & q_1 &= 1, & \frac{p_1}{q_1} &= 1, \\
p_2 &= 3, & q_2 &= 4, & \frac{p_2}{q_2} &= \frac{3}{4}, \\
p_3 &= 4, & q_3 &= 5, & \frac{p_3}{q_3} &= \frac{4}{5}, \\
p_4 &= 4z_4 + 3, & q_4 &= 5z_4 + 4, & \frac{p_4}{q_4} &= \frac{4z_4 + 3}{5z_4 + 4} = \frac{\pi}{4}.
\end{aligned}$$


THEOREM 5-1. The numerators $p_n$ and the denominators $q_n$ of the
$n$th convergent of (7) satisfy the equations

$$p_0 = z_0, \qquad p_1 = z_0 z_1 + 1, \qquad p_n = p_{n-1} z_n + p_{n-2} \quad \text{for } 2 \le n \le k,$$

and
$$q_0 = 1, \qquad q_1 = z_1, \qquad q_n = q_{n-1} z_n + q_{n-2} \quad \text{for } 2 \le n \le k. \tag{10}$$

In particular,

$$x = \frac{p_{k-1} z_k + p_{k-2}}{q_{k-1} z_k + q_{k-2}}. \tag{11}$$


Proof: Equation (11) follows from (10) by taking $n = k$ and noting
that $x = p_k/q_k$, so we need only prove (10). This we do by induction on $n$.
According to (9) we have, for $n = 2$,

$$p_1 z_2 + p_0 = z_2(z_0 z_1 + 1) + z_0 = p_2$$

and, similarly, $q_1 z_2 + q_0 = q_2$. Now suppose that the equations involving
$n$ in (10) are correct, and that $n < k$. Using the principle illustrated in (8),
we obtain

$$\begin{aligned}
\frac{p_{n+1}}{q_{n+1}} = \{z_0; z_1, \ldots, z_{n+1}\} &= \left\{z_0; z_1, \ldots, z_{n-1}, z_n + \frac{1}{z_{n+1}}\right\} \\
&= \frac{p_{n-1}\left(z_n + \dfrac{1}{z_{n+1}}\right) + p_{n-2}}{q_{n-1}\left(z_n + \dfrac{1}{z_{n+1}}\right) + q_{n-2}}
= \frac{(p_{n-1} z_n + p_{n-2}) + \dfrac{p_{n-1}}{z_{n+1}}}{(q_{n-1} z_n + q_{n-2}) + \dfrac{q_{n-1}}{z_{n+1}}} \\
&= \frac{p_n + \dfrac{p_{n-1}}{z_{n+1}}}{q_n + \dfrac{q_{n-1}}{z_{n+1}}}
= \frac{p_n z_{n+1} + p_{n-1}}{q_n z_{n+1} + q_{n-1}},
\end{aligned}$$

and this gives the required expressions for $p_{n+1}$ and $q_{n+1}$. ▲

One says that the sequences $p_0, p_1, p_2, \ldots$ and $q_0, q_1, q_2, \ldots$ are defined
recursively by equations (10), because each element after the second in each
sequence is defined in terms of earlier elements. The Fibonacci sequence
discussed in Section 1-3 was also defined recursively.


THEOREM 5-2. We have, for $1 \le n \le k$,

$$p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}, \tag{12}$$

or equivalently,

$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_n q_{n-1}}.$$


Proof: We again proceed by induction on $n$. Equation (12) holds for
$n = 1$, by (9). For $n > 1$ we see from (10) that if

$$p_{n-1} q_{n-2} - p_{n-2} q_{n-1} = (-1)^{n-2},$$

then

$$\begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (p_{n-1} z_n + p_{n-2}) q_{n-1} - p_{n-1}(q_{n-1} z_n + q_{n-2}) \\
&= p_{n-2} q_{n-1} - p_{n-1} q_{n-2} = -(-1)^{n-2} \\
&= (-1)^{n-1}.
\end{aligned}$$ ▲


THEOREM 5-3. For $2 \le n \le k$,

$$\frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = \frac{(-1)^n z_n}{q_n q_{n-2}}.$$

The proof of this theorem follows exactly the same lines as that of
Theorem 5-2, and is left to the reader.


THEOREM 5-4. The convergents are related to each other and to $x$
by the following inequalities:

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_4}{q_4} < \cdots < x < \cdots < \frac{p_5}{q_5} < \frac{p_3}{q_3} < \frac{p_1}{q_1},$$

except that the last convergent, $p_k/q_k$, is equal to $x$. That is, the even
convergents, $p_0/q_0, p_2/q_2, \ldots$, form a strictly increasing sequence of
numbers, with none larger than $x$, and the odd convergents, $p_1/q_1,
p_3/q_3, \ldots$, form a strictly decreasing sequence, with none smaller
than $x$.


Proof: From (10) we see that all $q$'s are positive. Hence it follows from
Theorems 5-2 and 5-3 that the differences

$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} \qquad \text{and} \qquad \frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}}$$

have opposite signs, which means that each convergent lies between the
two preceding ones. It is clear that $p_0/q_0 < p_1/q_1$, so we successively
obtain

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_1}{q_1},$$

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_3}{q_3} < \frac{p_1}{q_1},$$

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_4}{q_4} < \frac{p_3}{q_3} < \frac{p_1}{q_1},$$

and so on. Since $x$ itself is either the largest of the even convergents or the
smallest of the odd convergents, the theorem follows. ▲
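Identity (12) and the interlacing of Theorem 5-4 can be tested with exact rational arithmetic, including for non-integral $z_n$. The sketch below assumes at least two entries $z_0, z_1$ and positive $z_1, z_2, \ldots$; the names `convergent_terms` and `check_identities` are introduced only for the example.

```python
from fractions import Fraction

def convergent_terms(z):
    """p_n and q_n from equations (10); z needs at least two entries."""
    p = [z[0], z[0] * z[1] + 1]
    q = [1, z[1]]
    for n in range(2, len(z)):
        p.append(p[n - 1] * z[n] + p[n - 2])
        q.append(q[n - 1] * z[n] + q[n - 2])
    return p, q

def check_identities(z):
    """Return (identity (12) holds, even/odd convergents interlace as in Theorem 5-4)."""
    p, q = convergent_terms(z)
    k = len(z) - 1
    det_ok = all(p[n] * q[n - 1] - p[n - 1] * q[n] == (-1) ** (n - 1)
                 for n in range(1, k + 1))
    values = [Fraction(p[n]) / q[n] for n in range(k + 1)]
    evens, odds = values[0::2], values[1::2]
    increasing = all(a < b for a, b in zip(evens, evens[1:]))
    decreasing = all(a > b for a, b in zip(odds, odds[1:]))
    return det_ok, increasing and decreasing and max(evens) <= min(odds)

print(check_identities([3, 1, 4, 2, 7]))
print(check_identities([Fraction(1, 2), Fraction(3, 2), 2, Fraction(5, 3)]))
```

Both calls print `(True, True)`.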

The recursion relations (10) provide a simple procedure for actually 
calculating the successive convergents of a continued fraction. Consider 
for example the continued fraction {3;1,4,2,7}. We construct the 
following table: 


n:      0    1    2    3    4
a_n:    3    1    4    2    7
p_n:    3    4
q_n:    1    1


Here $p_0$, $p_1$, $q_0$, $q_1$ have been computed in accordance with (10). To
determine $p_2$ and $q_2$, we multiply $a_2$ by $p_1$ and add $p_0$, obtaining 19, and
we multiply $a_2$ by $q_1$ and add $q_0$, obtaining 5.
Continuing in this fashion, we complete the table:


n:      0    1    2    3    4
a_n:    3    1    4    2    7
p_n:    3    4   19   42  313
q_n:    1    1    5   11   82


Thus $\{3; 1, 4, 2, 7\} = 313/82$, and the convergents are 3/1, 4/1, 19/5,
42/11, 313/82.
It should perhaps be mentioned that if we define

$$p_{-2} = 0, \quad p_{-1} = 1, \qquad q_{-2} = 1, \quad q_{-1} = 0,$$

then equations (10) can be written more simply as

$$\begin{aligned}
p_n &= p_{n-1} z_n + p_{n-2}, \\
q_n &= q_{n-1} z_n + q_{n-2}
\end{aligned} \qquad \text{for } n \ge 0.$$

This also simplifies the construction of tables of convergents such as that
above, since it is no longer necessary to work out $p_0/q_0$ and $p_1/q_1$
separately. For example:


n:     -2   -1    0    1    2
a_n:              3    1    4
p_n:    0    1    3    4   19
q_n:    1    0    1    1    5
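The table construction with the starting values $p_{-2} = 0$, $p_{-1} = 1$, $q_{-2} = 1$, $q_{-1} = 0$ can be sketched as a single loop. The function name `convergents` is an assumption of this example.

```python
def convergents(z):
    """List of (p_n, q_n) for n = 0, 1, ..., using p_n = p_{n-1} z_n + p_{n-2} and likewise for q_n."""
    p_prev2, p_prev1 = 0, 1   # p_{-2}, p_{-1}
    q_prev2, q_prev1 = 1, 0   # q_{-2}, q_{-1}
    result = []
    for zn in z:
        p = p_prev1 * zn + p_prev2
        q = q_prev1 * zn + q_prev2
        result.append((p, q))
        p_prev2, p_prev1 = p_prev1, p
        q_prev2, q_prev1 = q_prev1, q
    return result

print(convergents([3, 1, 4, 2, 7]))
# [(3, 1), (4, 1), (19, 5), (42, 11), (313, 82)]
```

Because of the extended starting values, no special case is needed for $n = 0$ or $n = 1$.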
PROBLEMS 


1. Compute the convergents of the following continued fractions: 

(a) $\{3; 7, 2, 1, 1, 2\}$,

(b) $\{1; 2, 3, 4, 5\}$,

(c) $\{1; 1, 1, 1, 1\}$. (What if the 1's were continued further?)

2. Find the continued fraction expansions of 3.14159 and 3.1416. What can
you say about a continued fraction expansion for $\pi$?

3. Suppose that all numbers $z_n$ in (10) are positive integers. Then show that
for $n \ge 0$, we have $(p_n, q_n) = 1$.

4. Prove Theorem 5-3.

5. Show that $q_n/q_{n-1} = \{z_n; z_{n-1}, \ldots, z_1\}$.

6. Show that $p_n/p_{n-1} = \{z_n; z_{n-1}, \ldots, z_1, z_0\}$ if $z_0 > 0$, whereas $p_n/p_{n-1} =
\{z_n; z_{n-1}, \ldots, z_2\}$ if $z_0 = 0$.
7. If there is any sense to be attached to the equation $x = \{1; 1, 1, 1, \ldots\}$,
then clearly $x = \{1; x\}$. Use this to find the only possible value for $x$. Similarly,
find the only possible value of $\{2; 3, 2, 3, 2, 3, \ldots\}$, and of $\{3; 4, 1, 4, 1, 4, 1, \ldots\}$.
Can you make (and prove) a general statement about the values of all
nonterminating simple continued fractions in which the $z_n$'s form periodic sequences
of positive integers?


5-3 The simple continued fraction expansion of a rational number. 
We now return to the simple continued fractions, in which the partial 
quotients are positive integers, and consider first the expansion of a 
rational number as such a fraction. We have seen that by eliminating the 
remainders in the Euclidean algorithm, we obtain the finite expansion (6), 
so that every rational number has a finite simple continued fraction 
expansion. There remains the possibility that there are several such ex- 
pansions for a single number, and it is even conceivable that there is a 
nonterminating simple continued fraction which in some sense represents 
the number, just as the nonterminating decimal 0.333... represents 1/3. 
The latter possibility will be eliminated in the next section; for the moment 
we restrict attention to the finite case. 


THEOREM 5-5. There is only one finite simple continued fraction
$\{a_0; a_1, \ldots, a_N\}$ whose value is a specified rational number $x$, if it is
required that $a_N$ be larger than 1. The only other finite simple continued
fraction with value $x$ is $\{a_0; a_1, \ldots, a_N - 1, 1\}$.


The ambiguity described is illustrated by the following expansions of
4/11:

$$\frac{4}{11} = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3}}} = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1}}}}.$$

The idea of the proof is very simple. Since $\{0; a_1, \ldots, a_N\}$ is a number
between 0 and 1 for all $a_1, \ldots, a_N$, the integer $a_0$ such that
$$x = \{a_0; a_1, \ldots, a_N\}$$
for suitable $a_1, \ldots, a_N$ is uniquely determined: it is $[x]$. Thus any two
expansions of $x$ agree in the first partial quotient, and we can subtract
this common part and go on to the next partial quotient, where the
uniqueness argument can be repeated. A formal inductive proof follows.


Proof: Suppose that $x = \{a_0; a_1, \ldots, a_N\} = \{b_0; b_1, \ldots, b_M\}$, and
that $a_N > 1$, $b_M > 1$. If we set $a_n' = \{a_n; a_{n+1}, \ldots, a_N\}$, then in
analogy with (8) we can write

$$x = \{a_0; a_1, \ldots, a_{n-1}, a_n'\}, \qquad 1 \le n \le N, \tag{13}$$

and similarly, for $b_n' = \{b_n; b_{n+1}, \ldots, b_M\}$, we have

$$x = \{b_0; b_1, \ldots, b_{n-1}, b_n'\}, \qquad 1 \le n \le M. \tag{14}$$

For $n < N$, we obtain

$$a_n' = a_n + \frac{1}{a_{n+1}'}, \qquad a_{n+1}' > 1, \tag{15}$$

so that $a_n = [a_n']$. Similarly, for $n < M$, we find $b_n = [b_n']$. Taking
$n = 0$, we have $a_0' = b_0' = x$, so that $a_0 = b_0 = [x]$. We proceed by
induction. Suppose that

$$a_0 = b_0, \quad a_1 = b_1, \quad \ldots, \quad a_{n-1} = b_{n-1}.$$

Then if $p_{n-1}/q_{n-1} = \{a_0; a_1, \ldots, a_{n-1}\}$, we find by (11), (13), and (14)
that

$$x = \frac{p_{n-1} a_n' + p_{n-2}}{q_{n-1} a_n' + q_{n-2}} = \frac{p_{n-1} b_n' + p_{n-2}}{q_{n-1} b_n' + q_{n-2}},$$

whence

$$(p_{n-1} a_n' + p_{n-2})(q_{n-1} b_n' + q_{n-2}) = (p_{n-1} b_n' + p_{n-2})(q_{n-1} a_n' + q_{n-2}),$$
$$(p_{n-2} q_{n-1} - p_{n-1} q_{n-2})\, b_n' = (p_{n-2} q_{n-1} - p_{n-1} q_{n-2})\, a_n';$$

thus, by (12), $b_n' = a_n'$. But then $a_n = [a_n'] = [b_n'] = b_n$. Hence all
partial quotients are the same, and when one expansion terminates, so
does the other.

This argument breaks down if any $a_{n+1}' = 1$, since that possibility was
excluded from (15). If in fact $a_{n+1}' = 1$, then $a_n = [a_n'] - 1$, $a_{n+1} = 1$,
and the expansion terminates at this point. Therefore the preceding
complete quotients $a_k'$ were larger than 1 (otherwise the expansion would
have terminated earlier), and the preceding partial quotients $a_k$ and $b_k$
were equal. ▲

According to Theorem 5-5, a rational number $s/t$ has two finite simple
continued fraction expansions $\{a_0; a_1, \ldots, a_N\}$, and in one of them $N$ is
even and in the other $N$ is odd. If we apply Theorem 5-2 with $n = N$,
we have $p_N/q_N = s/t$ and $q_{N-1} s - p_{N-1} t = (-1)^{N-1}$. Thus we arrive
at


THEOREM 5-6. The linear diophantine equation $sx - ty = 1$ has the
solution $x = q_{N-1}$ and $y = p_{N-1}$ if $s/t = \{a_0; a_1, \ldots, a_N\}$ and $N$
is odd.

When $s$ and $t$ are large, this is probably the quickest method of solving
the linear diophantine equation.
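
As an illustration of Theorem 5-6, here is a minimal Python sketch. The function names `cf_expansion` and `solve_linear` are assumptions of the sketch, and it supposes that $s$ and $t$ are positive and relatively prime. The expansion comes from the Euclidean algorithm; when its last index $N$ is even, the sketch switches to the second expansion of Theorem 5-5, which has odd $N$.

```python
def cf_expansion(s, t):
    """Partial quotients of s/t (t > 0), read off from the Euclidean algorithm."""
    quotients = []
    while t != 0:
        a, r = divmod(s, t)
        quotients.append(a)
        s, t = t, r
    return quotients


def solve_linear(s, t):
    """Return (x, y) with s*x - t*y == 1, assuming s, t > 0 and gcd(s, t) == 1."""
    a = cf_expansion(s, t)
    if (len(a) - 1) % 2 == 0:
        # N is even: use the other expansion {a_0; ..., a_N - 1, 1}, for which N is odd.
        a = a[:-1] + [a[-1] - 1, 1]
    p_prev, p = 1, a[0]   # p_{-1}, p_0
    q_prev, q = 0, 1      # q_{-1}, q_0
    for an in a[1:]:
        p_prev, p = p, p * an + p_prev
        q_prev, q = q, q * an + q_prev
    # Now p/q = p_N/q_N = s/t, and p_prev/q_prev is the convergent of index N - 1.
    return q_prev, p_prev


x, y = solve_linear(11, 4)
print(x, y, 11 * x - 4 * y)   # 3 8 1
```

For 11/4 = {2; 1, 3} the index N = 2 is even, so the sketch works with {2; 1, 2, 1} and reads the solution off the convergent 8/3.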


PROBLEM 


1. Use the method of this section to solve the following diophantine equations:
(a) 3154x - 2971y = 1,

(b) 3154x + 2071y = 45,

(c) 31416x + 10000y = 8.


5-4 The expansion of an irrational number. We have seen that for
irrational $x$ the algorithm (4) leads to the equations (5), and the latter
are of precisely the same form as (13), if $a_n' = x_n$. For irrational $x$, the
complete quotients $x_n$ are always larger than 1, so that the argument
following (13) shows that if for arbitrary $n$,

$$x = \{a_0; a_1, \ldots, a_n, x_{n+1}\} = \{b_0; b_1, \ldots, b_n, x_{n+1}'\},$$

where the $a_k$ and $b_k$ are integers, and all (with the possible exception of
$a_0$ and $b_0$) are positive, then $a_k = b_k$ for $k = 1, \ldots, n$. If the algorithm
(4) is continued indefinitely, an infinite simple continued fraction
$\{a_0; a_1, a_2, \ldots\}$ results, and there is just one such fraction, corresponding to
each irrational number $x$. The question we must now face is, what sense
can be made of the equation $x = \{a_0; a_1, a_2, \ldots\}$?


THEOREM 5-7. If the infinite simple continued fraction $\{a_0; a_1, a_2, \ldots\}$
is associated with $x$ by means of equations (4), and this continued
fraction has convergents $p_0/q_0, p_1/q_1, \ldots$, then $\lim_{n\to\infty} p_n/q_n = x$. That
is, the difference between $x$ and the nth convergent approaches 0 as $n$
increases without limit.


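Proof: If we put $x = \{a_0; a_1, \ldots, a_n, x_{n+1}\}$, then we obtain from (11)

$$x - \frac{p_n}{q_n} = \frac{p_n x_{n+1} + p_{n-1}}{q_n x_{n+1} + q_{n-1}} - \frac{p_n}{q_n} = \frac{(-1)^n}{q_n (q_n x_{n+1} + q_{n-1})}, \tag{16}$$

and hence, since $q_n$ increases without bound as $n \to \infty$,

$$\lim_{n\to\infty} \left( x - \frac{p_n}{q_n} \right) = 0. \; ▲$$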

If the sequence of convergents of an infinite continued fraction
converges to a certain number $x$, we say that the value of the continued
fraction is $x$, or that the continued fraction converges to, or equals, $x$. What
we have just shown is that to every irrational number $x$, there corresponds
a unique infinite simple continued fraction whose value is $x$, and that this
continued fraction is generated by the algorithm (4). The following
theorem gives the complementary result.


THEOREM 5-8. Every infinite simple continued fraction converges to
a real number.


Proof: Let the continued fraction be $\{a_0; a_1, a_2, \ldots\}$, with convergents
$p_0/q_0, p_1/q_1, \ldots$, and let $X$ be the rational number $p_n/q_n$. Then $X$ has
the expansion $\{a_0; a_1, \ldots, a_n\}$, and this finite continued fraction has the
same convergents $p_k/q_k$ as the infinite expansion, for $k = 1, \ldots, n$. Thus
the inequalities of Theorem 5-4 hold in this same range (with $x$ replaced
by $X$), and we deduce that the even convergents of the infinite expansion
form an increasing sequence bounded above by $p_1/q_1$, for example, and
the odd convergents form a decreasing sequence bounded below by $p_0/q_0$.
A fundamental principle concerning infinite sequences of real numbers is
that every increasing sequence which is bounded above is convergent,
and that every decreasing sequence which is bounded below is convergent.
Hence the limits

$$\lim_{n\to\infty} \frac{p_{2n}}{q_{2n}} \quad \text{and} \quad \lim_{n\to\infty} \frac{p_{2n+1}}{q_{2n+1}}$$

exist, and to prove that $\lim_{n\to\infty} p_n/q_n$ exists, it suffices to show that they are
equal. To this end we invoke Theorem 5-2, with $2n$ in place of $n$:

$$\frac{p_{2n}}{q_{2n}} - \frac{p_{2n-1}}{q_{2n-1}} = \frac{(-1)^{2n-1}}{q_{2n}\, q_{2n-1}}.$$

From equations (10) we see that

$$q_0 = 1, \qquad q_1 \ge 1, \qquad q_n \ge q_{n-1} + q_{n-2},$$

and from these relations it is easy to prove by induction that $q_n \ge n$ for
$n = 1, 2, \ldots$. It follows from (15) that

$$\lim_{n\to\infty} \left( \frac{p_{2n}}{q_{2n}} - \frac{p_{2n-1}}{q_{2n-1}} \right) = 0,$$

and since the separate limits are known to exist, we have

$$\lim_{n\to\infty} \frac{p_{2n}}{q_{2n}} = \lim_{n\to\infty} \frac{p_{2n-1}}{q_{2n-1}}.$$

Hence $\lim p_n/q_n$ exists. ▲

THEOREM 5-9. The simple continued fraction expansion of a rational
number is always finite. Equivalently, the value of an infinite simple
continued fraction is always irrational.

Proof: We have seen that an infinite simple continued fraction converges
to a real number $x$, and that the continued fraction results by applying
the algorithm (4) to this number $x$. If $x$ were rational, the expansion
would be finite, since it is then just the Euclidean algorithm. ▲


THEOREM 5-10. If $x$ is irrational, the sequence $\{p_{2n}/q_{2n}\}$ is an
increasing sequence with limit $x$, and the sequence $\{p_{2n+1}/q_{2n+1}\}$ is a
decreasing sequence with limit $x$. Moreover, each convergent is closer to $x$
than the preceding one:

$$\left| x - \frac{p_n}{q_n} \right| < \left| x - \frac{p_{n-1}}{q_{n-1}} \right|, \qquad \text{for } n \ge 1.$$


Proof: The first sentence of the theorem is simply a combination of
Theorems 5-4 and 5-8. Using (16), we have

$$|q_n x - p_n| = \frac{1}{q_n x_{n+1} + q_{n-1}}.$$

Since $x$ is irrational, $x_{n+1} > [x_{n+1}] = a_{n+1}$, and hence

$$q_n x_{n+1} + q_{n-1} > q_n a_{n+1} + q_{n-1} = q_{n+1},$$

whereas

$$q_n x_{n+1} + q_{n-1} < q_n (a_{n+1} + 1) + q_{n-1} = q_{n+1} + q_n \le q_{n+2}.$$

Thus

$$\frac{1}{q_{n+2}} < |q_n x - p_n| < \frac{1}{q_{n+1}}.$$

Since the $q_n$ form an increasing sequence, it follows that

$$|q_n x - p_n| < |q_{n-1} x - p_{n-1}|, \tag{17}$$

and this is a stronger inequality than that of the theorem. ▲


PROBLEMS 


1. Reconsider Problem 7 of Section 5-2. 


2. Show that the continued fraction expansion of $\sqrt{3}$ is periodic, and compute
the first few convergents. Proceed similarly for $\sqrt{13}$.


3. Is there any sense to be made of the equation 


1 5 
IME Wawa 


2 


4. Prove that (17) implies the inequality of Theorem 5-10. 

5. What information can you get about the Fibonacci numbers from the
equation $(1 + \sqrt{5})/2 = \{1; 1, 1, 1, \ldots\}$? How is it connected with the inequality
$u_n < (7/4)^n$ proved in Chapter 1? [Hint: Compute a number of convergents.]

6. Describe by inequalities the set of real numbers having a fixed set of
integers $a_0, a_1, \ldots, a_n$ as their first $n + 1$ partial quotients.

7. For $x > 1$, show that the kth convergent of the expansion of $1/x$ is the
reciprocal of the (k - 1)th convergent of the expansion of $x$.


5-5 The expansion of quadratic irrationalities. Decimal expansions 
of rational numbers are always either terminating or periodic, and we have 
just seen that simple continued fraction expansions of rational numbers 
always terminate. We shall now see that the infinite periodic continued 
fractions correspond exactly to the real quadratic irrational numbers, 
these being the real irrational numbers which are solutions of quadratic 
equations $ax^2 + bx + c = 0$ with integral coefficients $a$, $b$, $c$. According
to the quadratic formula, such numbers are of the form $x + y\sqrt{d}$, where
$x$ and $y$ are rational and $d$ is a positive integer, not a square.

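Consider for example the number $\xi = \sqrt{7}$. Designating the complete
quotients by $\xi_1, \xi_2, \ldots$, we have

$$\sqrt{7} = 2 + (\sqrt{7} - 2), \qquad a_0 = 2, \qquad \xi_1 = (\sqrt{7} - 2)^{-1};$$
$$\xi_1 = \frac{1}{\sqrt{7} - 2} = \frac{\sqrt{7} + 2}{3} = 1 + \frac{\sqrt{7} - 1}{3}, \qquad a_1 = 1, \qquad \xi_2 = \frac{3}{\sqrt{7} - 1};$$
$$\xi_2 = \frac{3}{\sqrt{7} - 1} = \frac{\sqrt{7} + 1}{2} = 1 + \frac{\sqrt{7} - 1}{2}, \qquad a_2 = 1, \qquad \xi_3 = \frac{2}{\sqrt{7} - 1};$$
$$\xi_3 = \frac{2}{\sqrt{7} - 1} = \frac{\sqrt{7} + 1}{3} = 1 + \frac{\sqrt{7} - 2}{3}, \qquad a_3 = 1, \qquad \xi_4 = \frac{3}{\sqrt{7} - 2};$$
$$\xi_4 = \frac{3}{\sqrt{7} - 2} = \sqrt{7} + 2 = 4 + (\sqrt{7} - 2), \qquad a_4 = 4, \qquad \xi_5 = (\sqrt{7} - 2)^{-1}.$$

Since $\xi_5 = \xi_1$, also $\xi_6 = \xi_2$, $\xi_7 = \xi_3$, ..., so the sequence $\{\xi_k\}$ (and
therefore also $\{a_k\}$) is periodic. Thus we have the periodic expansion

$$\sqrt{7} = \{2; 1, 1, 1, 4, 1, 1, 1, 4, \ldots\}.$$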

Using the relations (10), we construct the following table:

    k    |  0   1   2   3    4    5    6
    a_k  |  2   1   1   1    4    1    1   ...
    p_k  |  2   3   5   8   37   45   82   ...
    q_k  |  1   1   2   3   14   17   31   ...

Here the element $37 = p_4$, for example, is determined by multiplying
$a_4 = 4$ by $p_3 = 8$ and adding $p_2 = 5$. Thus the convergents to $\sqrt{7}$ are
2, 3, 5/2, 8/3, 37/14, 45/17, ....
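
The hand computation above can be mirrored in integer arithmetic. The following Python sketch is an assumption-laden illustration, not the text's own procedure: it writes each complete quotient as $\xi_k = (\sqrt{d} + m)/\mathit{den}$ with integers `m` and `den` (a standard reformulation of the rationalizing steps), and it stops when a pair `(m, den)` recurs, which is the same observation as $\xi_5 = \xi_1$. The names `sqrt_cf` and `convergents` are hypothetical.

```python
from math import isqrt


def sqrt_cf(d):
    """Return (a0, period) for sqrt(d), where d is a positive nonsquare integer.

    Each complete quotient is kept as (sqrt(d) + m) / den with integers m, den.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0
    seen = {}          # (m, den) -> position of the corresponding quotient
    quotients = []
    while True:
        m = den * a - m
        den = (d - m * m) // den
        if (m, den) in seen:               # a complete quotient has recurred
            return a0, quotients[seen[(m, den)]:]
        seen[(m, den)] = len(quotients)
        a = (a0 + m) // den
        quotients.append(a)


def convergents(quotients):
    """Convergents (p_k, q_k) of the finite continued fraction with these quotients."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    out = [(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out


a0, period = sqrt_cf(7)
print(a0, period)                                  # 2 [1, 1, 1, 4]
print(convergents([a0] + period + period[:2]))     # the columns of the table above
```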

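Conversely, consider the continued fraction

$$\xi = \{1; 3, 1, 2, 1, 2, \ldots\},$$

where $a_{2n} = 1$ and $a_{2n+1} = 2$ for $n \ge 1$. We have

$$\xi_2 = \{1; 2, 1, 2, \ldots\} = \{1; 2, \xi_2\},$$

so that

$$\xi_2 = 1 + \cfrac{1}{2 + \cfrac{1}{\xi_2}} = \frac{3\xi_2 + 1}{2\xi_2 + 1},$$
$$2\xi_2^2 - 2\xi_2 - 1 = 0,$$
$$\xi_2 = \frac{1 + \sqrt{3}}{2}.$$

(The plus sign is chosen before the radical since $\xi_2 > 0$.) Hence

$$\xi = \{1; 3, \xi_2\} = 1 + \cfrac{1}{3 + \cfrac{1}{\xi_2}} = 3 - \sqrt{3}.$$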

We can now show that these are not isolated phenomena. 


THEOREM 5-11. Every eventually periodic simple continued fraction 
converges to a quadratic irrationality, and every quadratic irrationality 
has a simple continued fraction expansion which is eventually periodic. 


Proof: The first part is quite simple. Suppose that the first period begins
with $a_n$, and let the length of the period be $h$; then $a_{k+h} = a_k$ for $k \ge n$.
Set

$$\xi = \{a_0; a_1, \ldots\} \quad \text{and} \quad \xi_k = \{a_k; a_{k+1}, \ldots\},$$

so that $\xi_{k+h} = \xi_k$ for $k \ge n$. By this and equation (11),

$$\xi = \frac{p_{n-1}\xi_n + p_{n-2}}{q_{n-1}\xi_n + q_{n-2}} = \frac{p_{n+h-1}\xi_n + p_{n+h-2}}{q_{n+h-1}\xi_n + q_{n+h-2}},$$

and hence $\xi_n$ satisfies a quadratic equation with integral coefficients.
Since $\xi_n$ is obviously not rational, it is a quadratic irrationality. Again by
(11), the same is true of $\xi$ itself, since if

$$A\xi_n^2 + B\xi_n + C = 0,$$

then

$$A(-q_{n-2}\xi + p_{n-2})^2 + B(-q_{n-2}\xi + p_{n-2})(q_{n-1}\xi - p_{n-1}) + C(q_{n-1}\xi - p_{n-1})^2 = 0,$$

and this is a quadratic equation in $\xi$.
The proof of the converse involves a little more computation. Suppose
that

$$f(\xi) = A\xi^2 + B\xi + C = 0,$$

where $A$, $B$, and $C$ are integers, and $\xi$ is irrational. Then equation (11)
yields

$$A(p_{k-1}\xi_k + p_{k-2})^2 + B(p_{k-1}\xi_k + p_{k-2})(q_{k-1}\xi_k + q_{k-2}) + C(q_{k-1}\xi_k + q_{k-2})^2 = 0,$$

or

$$A_k\xi_k^2 + B_k\xi_k + C_k = 0,$$

where the integers $A_k$, $B_k$, and $C_k$ are given by the equations

$$A_k = Ap_{k-1}^2 + Bp_{k-1}q_{k-1} + Cq_{k-1}^2,$$
$$B_k = 2Ap_{k-1}p_{k-2} + B(p_{k-1}q_{k-2} + p_{k-2}q_{k-1}) + 2Cq_{k-1}q_{k-2},$$
$$C_k = Ap_{k-2}^2 + Bp_{k-2}q_{k-2} + Cq_{k-2}^2.$$

Thus

$$A_k = q_{k-1}^2\, f\!\left(\frac{p_{k-1}}{q_{k-1}}\right) \quad \text{and} \quad C_k = q_{k-2}^2\, f\!\left(\frac{p_{k-2}}{q_{k-2}}\right).$$

We now use the identity

$$au^2 + bu + c = a(u - v)^2 + (2av + b)(u - v) + (av^2 + bv + c),$$

which is easily verified by multiplying out on the right and collecting
terms.* Choosing $a = A$, $b = B$, $c = C$, $u = p_{k-1}/q_{k-1}$ and $v = \xi$, and using the fact that
$A\xi^2 + B\xi + C = 0$, we obtain

$$A_k = q_{k-1}^2 \left( \frac{p_{k-1}}{q_{k-1}} - \xi \right) \left\{ 2A\xi + B + A\left( \frac{p_{k-1}}{q_{k-1}} - \xi \right) \right\}.$$


* This is also, of course, just the Taylor expansion of $f(u)$ near the point $v$.

Now by (16),

$$\left| \xi - \frac{p_{k-1}}{q_{k-1}} \right| = \frac{1}{q_{k-1}(q_{k-1}\xi_k + q_{k-2})} < \frac{1}{q_{k-1}^2}, \tag{18}$$

so that

$$|A_k| < |2A\xi + B| + \frac{|A|}{q_{k-1}^2},$$

and similarly,

$$|C_k| < |2A\xi + B| + \frac{|A|}{q_{k-2}^2}.$$

Thus $|A_k|$ and $|C_k|$ remain bounded as $k \to \infty$.

To see that $|B_k|$ is also bounded, we use the fact that all the quantities
$B_k^2 - 4A_kC_k$ have the common value $B^2 - 4AC = D$. (This can be
proved by a straightforward but tedious computation or, if one is acquainted
with the theory of linear transformations, by noting that the expression
$A_k x'^2 + B_k x'y' + C_k y'^2$ is obtained from $Ax^2 + Bxy + Cy^2$ by the
unimodular substitution

$$x = p_{k-1}x' + p_{k-2}y', \qquad y = q_{k-1}x' + q_{k-2}y',$$

and that two such forms have the same discriminant.) Since $A_k$ and $C_k$
are bounded and $D$ is fixed,

$$B_k^2 = D + 4A_kC_k$$

must be bounded also.
Thus, there is a constant $M$ such that

$$|A_k| < M, \qquad |B_k| < M, \qquad |C_k| < M$$

for all $k$. Since there are fewer than $(2M + 1)^3$ triples of integers each
numerically smaller than $M$, there must be three indices, say $n_1$, $n_2$, and
$n_3$, which yield the same triple:

$$A_{n_1} = A_{n_2} = A_{n_3}, \qquad B_{n_1} = B_{n_2} = B_{n_3}, \qquad C_{n_1} = C_{n_2} = C_{n_3}.$$

Since the equation $A_{n_1}x^2 + B_{n_1}x + C_{n_1} = 0$ has only two roots, two of
the numbers $\xi_{n_1}$, $\xi_{n_2}$, $\xi_{n_3}$ must be equal; with suitable notation, they can
be taken to be $\xi_{n_1}$ and $\xi_{n_2}$, where $n_1 < n_2$. If $n_2 - n_1 = h$, then
$\xi_{n_1+h} = \xi_{n_1}$ and

$$\xi_{n_1+h+1} = \frac{1}{\xi_{n_1+h} - [\xi_{n_1+h}]} = \frac{1}{\xi_{n_1} - [\xi_{n_1}]} = \xi_{n_1+1},$$
$$\xi_{n_1+h+2} = \frac{1}{\xi_{n_1+h+1} - [\xi_{n_1+h+1}]} = \frac{1}{\xi_{n_1+1} - [\xi_{n_1+1}]} = \xi_{n_1+2},$$

and, in general, $\xi_{k+h} = \xi_k$ for $k \ge n_1$. Thus the $\xi$'s are eventually
periodic. Since each $a_k$ is determined exclusively by the corresponding $\xi_k$,
the same is true of the $a_k$'s, and the proof is complete. ▲


PROBLEMS 


1. Find the continued fraction expansions of $\sqrt{d}$ for d = 2, 3, 5, 6, 7, 8, 10.
Make all possible general conjectures which are consistent with these data and
which seem reasonable to you, and test them against the cases d = 11, 12, 13, 14.
Can you prove any of your conjectures?

2. The equation $x^2 - x - 1 = 0$ can be solved by transposing and dividing
by $x$:

$$x = \frac{x + 1}{x} = 1 + \frac{1}{x} = 1 + \cfrac{1}{1 + \cfrac{1}{x}} = \cdots = \frac{1 + \sqrt{5}}{2}.$$

Solve the equation $ax^2 - abx - 1 = 0$, in which $a$ and $b$ are positive integers,
by continued fractions. Try to find other equations which lend themselves to
this approach.

3. Find the continued fraction expansions of $\sqrt{n^2 + 1}$ and $\sqrt{n^2 + n}$, where
$n$ is a positive integer.

4. Prove the assertion made in the text that for $k \ge 2$,

$$B_k^2 - 4A_kC_k = B^2 - 4AC.$$


5. Below is an outline of a proof that the expansion of $\sqrt{d}$ (d a positive
nonsquare integer) is periodic after $a_0$. Fill in all details. (If $\alpha = r + s\sqrt{d}$, where
$r$ and $s$ are rational, then $\bar{\alpha} = r - s\sqrt{d}$.)

Put $\xi_0 = \sqrt{d} + [\sqrt{d}]$. Then $-1 < \bar{\xi}_0 < 0$, and from the equation

$$\xi_k = a_k + \frac{1}{\xi_{k+1}}$$

it follows that $-1 < \bar{\xi}_k < 0$ for $k \ge 1$. This in turn shows that $a_k = [-1/\bar{\xi}_{k+1}]$.
Now suppose that the periodicity of $\{\xi_k\}$ begins when $k = n$, and that the
period is of length $h$, so that $\xi_n = \xi_{n+h}$. Consequently, $a_{n-1} = a_{n+h-1}$, and
hence $\xi_{n-1} = \xi_{n+h-1}$, so that $\{\xi_k\}$ is periodic from the beginning.


5-6 Approximation theorems. Continued fractions provide a very 
powerful tool in that branch of the theory of numbers known as diophantine 
approximation. In this subject one is concerned not with equations but 
with inequalities; the basic problem is to discover what kinds of inequalities 
involving one or more variables have solutions in integers, or have in- 
finitely many solutions in integers. (The theorems are sometimes phrased 
in terms of rational numbers, but it comes to the same thing, of course, 
since rational numbers are quotients of integers.) For example, the 
fractions (not necessarily reduced) which have fixed denominator $q$ are
equally spaced along the real axis, the distance between successive ones
being $1/q$:

$$\ldots, -2/q, \; -1/q, \; 0/q, \; 1/q, \; 2/q, \; \ldots$$

Thus every real number $x$ is at most a distance $1/2q$ from such a fraction,
and we have the very simple theorem that for every real number $x$ and
every integer $q$, there is an integer $p$ such that $|qx - p| \le 1/2$. Here we
have a diophantine inequality which is solvable for every value of one of
the integer variables. Suppose that we asked only that such an inequality
be solvable for infinitely many values of $q$; could we then do better than
1/2? Is it, for example, possible to find a function $f(q)$ which tends to 0
as $q$ increases without limit, such that the inequality

$$|qx - p| < f(q) \tag{19}$$

has infinitely many integral solutions $p$, $q$? The answer to this question
is not at all obvious, and it is one of the purposes of the present section
to give an answer. The question can easily be refined or generalized in a
variety of ways: find the function $f(q)$ which approaches 0 most rapidly
and for which (19) has infinitely many solutions for every $x$, or for every
irrational $x$, or for every $x$ of some other class; or replace the function
$qx - p$ by some other function of two variables, or by some function of
more than two variables; etc. We shall not attack these more difficult 
questions here. 
The first problem posed above is answered by the following theorem. 


THEOREM 5-12. If $x$ is a rational number $s/t$, then for every rational
number $p/q$ different from $x$, the inequality

$$\left| x - \frac{p}{q} \right| \ge \frac{1}{tq}$$

holds. However, if $x$ is irrational, then there are infinitely many integral
solutions $p$, $q$ of the inequality

$$0 < \left| x - \frac{p}{q} \right| < \frac{1}{q^2}.$$

Thus $f(q) = 1/q$ is an appropriate function in (19), if $x$ is irrational,
whereas no function tending to 0 is allowable if $x$ is rational.


Proof: If $x = s/t$ and $p/q \ne s/t$, then

$$\left| \frac{s}{t} - \frac{p}{q} \right| = \frac{|sq - tp|}{tq} \ge \frac{1}{tq},$$

since $|sq - tp|$ is a positive integer. On the other hand, if $x$ is irrational,
then it has a nonterminating simple continued fraction, and so infinitely
many distinct convergents $p_k/q_k$, and the desired result follows from (18). ▲

The next theorem shows that the convergents of the continued fraction
expansion of $x$, which we have just seen to be very good approximations to
$x$, are in a certain sense the best possible approximations to $x$.


THEOREM 5-13. If $n \ge 1$, $0 < q \le q_n$ and $p/q \ne p_n/q_n$, then

$$|q_n x - p_n| \le |qx - p|, \tag{20}$$

with strict inequality unless $n = 1$ and $q_{n+1} = 2$. It follows that
under the same circumstances,

$$\left| x - \frac{p_n}{q_n} \right| \le \left| x - \frac{p}{q} \right|. \tag{21}$$


Proof: Suppose first that $q = q_n$. Then

$$\left| \frac{p_n}{q_n} - \frac{p}{q_n} \right| \ge \frac{1}{q_n},$$

since $p \ne p_n$. On the other hand, by (18),

$$\left| x - \frac{p_n}{q_n} \right| = \frac{1}{q_n(q_n x_{n+1} + q_{n-1})} \le \frac{1}{q_n q_{n+1}} \le \frac{1}{2q_n},$$

with strict inequality unless $q_{n+1} = 2$ and $x_{n+1} = 1$, and (21) follows
from these two inequalities. When $q = q_n$, (20) and (21) are equivalent,
so the theorem is proved in this case. Henceforth we can assume that $q$
is not the denominator of any convergent of $x$.

Clearly, we can also suppose that $(p, q) = 1$. Let $k$ be the unique index
such that $q_{k-1} < q < q_k$; then $1 \le k \le n$. If we prove the strict
inequality (20) with $n$ replaced by $k$, then, by (17), it will also hold for $n$.

Define $u$ and $v$ by the equations

$$u p_k + v p_{k-1} = p,$$
$$u q_k + v q_{k-1} = q;$$

obviously they are not both negative. Since the determinant of this
system is $\pm 1$, it follows from Cramer's rule that $u$ and $v$ are the integers

$$\pm \begin{vmatrix} p & p_{k-1} \\ q & q_{k-1} \end{vmatrix} \quad \text{and} \quad \pm \begin{vmatrix} p_k & p \\ q_k & q \end{vmatrix},$$

and neither of these is zero. Since $q = u q_k + v q_{k-1} < q_k$, the integers $u$
and $v$ must have opposite signs. By Theorem 5-10, the numbers $q_k x - p_k$
and $q_{k-1} x - p_{k-1}$ also have opposite signs. Hence $u(q_k x - p_k)$ and
$v(q_{k-1} x - p_{k-1})$ have the same sign. But

$$qx - p = u(q_k x - p_k) + v(q_{k-1} x - p_{k-1}),$$

so that

$$|qx - p| > |q_{k-1} x - p_{k-1}| > |q_k x - p_k|. \; ▲$$

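The next theorem shows that the only solutions of (19) with $f(q) = 1/2q$
are the convergents of $x$.

THEOREM 5-14. If

$$\left| x - \frac{p}{q} \right| < \frac{1}{2q^2},$$

then $p/q$ is a convergent $p_n/q_n$ of $x$.


Proof: Suppose that $p/q$ is not a convergent. Then for some index $k$,
$q_{k-1} < q < q_k$, and, as was shown at the end of the proof of the
preceding theorem,

$$|q_{k-1} x - p_{k-1}| < |qx - p|.$$

Since $|qx - p| < 1/2q$, we have

$$\left| x - \frac{p_{k-1}}{q_{k-1}} \right| < \frac{1}{2q\,q_{k-1}},$$

and also

$$\left| x - \frac{p}{q} \right| < \frac{1}{2q^2} < \frac{1}{2q\,q_{k-1}}.$$

But these two inequalities imply that

$$\left| \frac{p}{q} - \frac{p_{k-1}}{q_{k-1}} \right| < \frac{1}{q\,q_{k-1}},$$

whereas

$$\left| \frac{p}{q} - \frac{p_{k-1}}{q_{k-1}} \right| = \frac{|p\,q_{k-1} - q\,p_{k-1}|}{q\,q_{k-1}} \ge \frac{1}{q\,q_{k-1}}.$$

This contradiction shows that $p/q$ is a convergent. ▲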
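
Theorem 5-14 can be checked numerically in a small case. The Python sketch below is illustrative only. It takes $x = \sqrt{2}$, whose expansion {1; 2, 2, 2, ...} is assumed known here, uses floating-point arithmetic (adequate for denominators this small, but an assumption of the sketch), and collects every fraction $p/q$ with $q \le 200$ satisfying $|x - p/q| < 1/2q^2$. Only the integer $p$ nearest to $qx$ needs to be tried for each $q$, since any other $p$ gives $|qx - p| \ge 1/2$.

```python
from fractions import Fraction
from math import sqrt

x = sqrt(2)
partial_quotients = [1] + [2] * 12      # assumed expansion of sqrt(2)

# Convergents p_k/q_k from the recurrence (10).
convs = set()
p_prev, p = 1, partial_quotients[0]
q_prev, q = 0, 1
convs.add(Fraction(p, q))
for a in partial_quotients[1:]:
    p_prev, p = p, p * a + p_prev
    q_prev, q = q, q * a + q_prev
    convs.add(Fraction(p, q))

# Fractions satisfying the hypothesis of Theorem 5-14 with denominator <= 200.
hits = set()
for den in range(1, 201):
    num = round(den * x)
    if abs(x - num / den) < 1 / (2 * den * den):
        hits.add(Fraction(num, den))

print(sorted(hits))
print(hits <= convs)    # every hit is a convergent, as the theorem asserts
```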


PROBLEMS 


1. Why are 22/7 and 355/113 such useful approximations to $\pi$?


2. Show, using continued fractions, that if $\alpha$ is a quadratic irrationality, then
there is a positive constant $c$ such that for every pair of integers $p$ and $q$ with $q > 0$,

$$\left| \alpha - \frac{p}{q} \right| > \frac{c}{q^2}.$$

3. Prove the theorem of Problem 2 without continued fractions. [Hint: Let the
quadratic equation defining $\alpha$ be $f(x) = ax^2 + bx + c = a(x - \alpha)(x - \alpha') = 0$,
where $a$, $b$ and $c$ are integers. Then $|q^2 f(p/q)|$ is a positive integer, and therefore
at least equal to 1.] Generalize the theorem and proof to higher-degree
irrationalities.

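4. Show that of two consecutive convergents to $x$, at least one satisfies the
inequality

$$\left| x - \frac{p}{q} \right| < \frac{1}{2q^2}.$$

[Hint: Show first that

$$\left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| = \left| x - \frac{p_n}{q_n} \right| + \left| x - \frac{p_{n+1}}{q_{n+1}} \right|,$$

and then give a proof by contradiction.]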

5. Below is a sketch of the proof of a theorem. Fill in all details, and state the 
theorem. 

If $x$ is a real number and $q$ is an integer, then the "fractional part" $qx - [qx] =
f(q, x)$ satisfies the inequality $0 \le f(q, x) < 1$. As $q$ takes the values 0, 1, 2, ..., n,
there are $n + 1$ points determined in the unit interval, and two of them must
lie in some one of the $n$ subintervals

$$\left[0, \frac{1}{n}\right), \; \left[\frac{1}{n}, \frac{2}{n}\right), \; \ldots, \; \left[\frac{n-1}{n}, 1\right).$$

The distance between these two points is less than $1/n$. Hence . . . .


6. Deduce the second half of Theorem 5-12 from the theorem of the preceding 
problem. 


CHAPTER 6 
THE GAUSSIAN INTEGERS 


6-1 Introduction. We shall now consider the arithmetical theory of the 
so-called Gaussian integers, these being simply the complex numbers 
a + bi in which a and b are ordinary integers. To keep matters straight,
we shall refer to the numbers 0, +1, +2,..., which have been the subject 
of discourse up to now, as the rational integers, and designate the set of 
rational integers by Z. We designate the set of Gaussian integers by Z[i], 
and use lower-case Greek letters to denote the elements of this set. We 
sometimes refer to Z and Z[i] as domains of integers.

The content of the first four chapters of this book is part of multipli- 
cative number theory, and it all eventually depends on the notion of 
divisibility. If we attempted to apply similar considerations to the rational 
numbers or fractions, instead of the rational integers, we should see im- 
mediately that everything becomes either trivial or nonsensical. For 
given two rational numbers A and B, with B ≠ 0, there is always a
rational number C such that A = BC (namely C = A/B), and thus every
nonzero rational number divides all others. Hence there are no primes 
and no GCD, every linear equation is solvable, every congruence is true, 
and so on. In other words, to have an interesting theory, it is necessary to 
work within a set of numbers in which division is not always possible 
(that is, not all quotients of elements belong to the set). On the other
hand, it simplifies matters greatly if the usual rules of algebra concerning 
multiplication, addition, and subtraction, as embodied in postulates I 
through IV of Chapter 1, continue to hold. The Gaussian integers 
meet both of these requirements, and have, in addition, a number of 
special properties which make it possible to build up a theory remark- 
ably similar to that already developed for rational integers. The bulk of 
the present chapter will be devoted to this simple case, but in the final 
section we shall consider the more complicated situation in which the 
integers are taken to be the elements $a + b\sqrt{10}$ of $Z[\sqrt{10}]$.

These generalizations of “rational” number theory should not be regarded 
as mere curiosities. They are special instances of a much broader develop- 
ment, called the theory of algebraic numbers, in which one considers 
general algebraic irrationalities—roots of algebraic equations of all degrees 
—rather than just certain quadratic irrationalities. This theory, which is 
deep and difficult, is not only interesting in its own right, but it has many 
applications in rational number theory, and its more comprehensive view- 
point makes possible a real understanding of various phenomena in ra- 
tional number theory which would otherwise remain completely mysterious. 
6-2 Divisibility, units, and primes. Let $\alpha$ and $\beta$ be Gaussian integers,
with $\beta \ne 0$. We say that $\beta$ divides $\alpha$, and write $\beta | \alpha$, if there is an element $\gamma$
of Z[i] such that $\alpha = \beta\gamma$. For example, $(1 + i) | 2$ since $2 = (1 + i)(1 - i)$,
and also $(1 - i) | (1 + i)$ since $1 + i = (1 - i) \cdot i$. On the other hand,
$(1 + i) \nmid (1 + 2i)$, for supposing the opposite leads to the equation

$$1 + 2i = (1 + i)(a + bi) = (a - b) + (a + b)i,$$

from which it follows, by comparison of real and imaginary parts, that
$a - b = 1$ and $a + b = 2$. Adding these equations, we obtain $2a = 3$,
which is false for every $a$ in Z.
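
The divisibility examples above can be reproduced mechanically. The following Python sketch is an illustration under stated assumptions: a Gaussian integer $a + bi$ is represented as the pair `(a, b)`, the helper names `mul`, `norm`, and `divides` are hypothetical, and the test used is that $\beta | \alpha$ exactly when both components of $\alpha\bar{\beta}$ are divisible by $\beta\bar{\beta} = c^2 + d^2$, where $\beta = c + di$.

```python
def mul(z, w):
    """Product of the Gaussian integers z = (a, b) and w = (c, d)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def norm(z):
    """a^2 + b^2 for z = (a, b)."""
    a, b = z
    return a * a + b * b


def divides(beta, alpha):
    """True if beta | alpha in Z[i]; beta must be nonzero."""
    n = norm(beta)
    if n == 0:
        raise ValueError("beta must be nonzero")
    c, d = beta
    re, im = mul(alpha, (c, -d))     # alpha times the conjugate of beta
    return re % n == 0 and im % n == 0


print(divides((1, 1), (2, 0)))    # True:  2 = (1 + i)(1 - i)
print(divides((1, -1), (1, 1)))   # True:  1 + i = (1 - i) * i
print(divides((1, 1), (1, 2)))    # False: 2a = 3 has no integer solution
```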

We must verify that this definition of divisibility is consistent with 
that given earlier for rational integers, for otherwise we should have to use 
different symbols to indicate the domains with respect to which
divisibility is asserted. That is, it is conceivable that $2 | 7$ when 2 and 7 are
regarded as elements of Z[i], because there might be a Gaussian integer $\alpha$
such that $7 = 2\alpha$, even though there is no rational integer $a$ such that
$7 = 2a$. In fact, such a thing never happens. For if $a$ and $b \ne 0$ are in Z,
and $b | a$ in Z[i], then for some $\gamma = c + di$ we have $a = b\gamma$. But the
equation

$$a = b\gamma = bc + bdi$$

yields $a = bc$ and $bd = 0$; thus $d = 0$, $\gamma$ is real and is therefore in Z,
and $b | a$ in Z[i] implies $b | a$ in Z.
If $\alpha = a + bi$, then the nonnegative rational integer

$$(a + bi)(a - bi) = a^2 + b^2$$

is called the norm of $\alpha$, and is designated by $N\alpha$ or $N(\alpha)$. By writing out
the multiplication, it is easily verified that for every $\alpha$ and $\beta$ in Z[i],

$$N(\alpha\beta) = N\alpha\, N\beta. \tag{1}$$

We express this by saying that the norm is multiplicative. It follows from
(1) that if $\alpha | \gamma$, then $N\alpha | N\gamma$.

Certain Gaussian integers divide every Gaussian integer; these are
called the units of Z[i]. In particular, a unit must divide 1. Conversely, if
$\varepsilon | 1$, then $\varepsilon$ is a unit, for we then have $1 = \varepsilon\eta$, and so for every $\alpha$ in Z[i] we
can write $\alpha = \alpha \cdot 1 = (\alpha\eta)\varepsilon$, whence $\varepsilon | \alpha$. So we can find the units of Z[i]
simply by finding the divisors of 1. Now if $\varepsilon | 1$, then $N\varepsilon | N1$, so $N\varepsilon = 1$.
The only solutions of the equation $a^2 + b^2 = 1$ in Z are $\pm 1, 0$ and $0, \pm 1$,
and hence the only units of Z[i] are $\pm 1$ and $\pm i$.

If $\varepsilon$ is a unit and $\alpha$ and $\beta$ are elements of Z[i] such that $\alpha = \beta\varepsilon$, then $\alpha$
and $\beta$ are said to be associates. Note that under this definition, $\alpha$ is an
associate of itself.
A Gaussian integer which has no other divisors than its associates and
the units is said to be prime in Z[i] or, if the context is well understood,
it is simply said to be prime. This point has to be emphasized since in the
present context the rational integers do behave differently when
considered as elements of Z or of Z[i], some of them being prime in the first
domain and not in the second. For as we saw above, the rational prime
2 splits in a nontrivial way, and is therefore not prime, in Z[i]:

$$2 = i(1 - i)^2.$$

(Here the factor $1 - i$ is prime, because if $1 - i = \alpha\beta$, then $2 = N\alpha\, N\beta$,
and so either $N\alpha = 1$ or $N\beta = 1$.) On the other hand, not every rational
prime splits in Z[i]; for example, 3 does not. For if $\alpha | 3$, then $N\alpha | 9$, so that
if $\alpha$ is neither a unit (with norm 1) nor an associate of 3 (with norm 9), it
must be that $N\alpha = 3$. But the equation $a^2 + b^2 = 3$ has no solution in Z.

Sometimes we shall distinguish the two notions of primality by referring
to Gaussian primes and rational primes.

We have so far only two examples of Gaussian primes: $1 - i$ (or its
associates $\pm 1 \pm i$) and 3. We pause to show that, in fact, there are
infinitely many primes in Z[i].


THEOREM 6-1. There are infinitely many rational primes of the form
4k - 1.


Proof: Every odd prime is congruent either to 1 or to -1 (mod 4).
Hence every odd number is congruent (mod 4) to 1 or -1, according as
the number of its prime factors of the form 4k - 1 is even or odd. In
particular, if $n \equiv -1 \pmod 4$, then $n$ contains an odd number of prime
factors, and therefore at least one, of the form 4k - 1.

Now let $p_k$ be the kth prime, and set $N = 4p_1p_2 \cdots p_n - 1$. Every
prime divisor of $N$ is different from any of $p_1, p_2, \ldots, p_n$, and hence is
larger than $p_n$. By the preceding paragraph, $N$ has a prime factor of the
form 4k - 1. We have therefore shown that for every $n$, there is a prime
of the form 4k - 1 which is larger than $p_n$. This implies the theorem. ▲


THEOREM 6-2. There are infinitely many Gaussian primes. 


Proof: We prove this by proving a stronger statement, namely that
infinitely many rational primes are also Gaussian primes. In fact, we
shall show that every rational prime $p \equiv -1 \pmod 4$ is also a Gaussian
prime. Suppose that $p = \alpha\beta$, so that $Np = p^2 = N\alpha\, N\beta$. If $N\alpha = 1$,
then $\alpha$ is a unit. If $N\alpha = p^2$, then $N\beta = 1$, so $\beta$ is a unit. Hence the
supposed factorization is trivial unless $N\alpha = N\beta = p$. However, if $\alpha = a + bi$,
we then have

$$a^2 + b^2 = p,$$
$$a^2 + b^2 \equiv 3 \pmod 4.$$

But a square is either 0 or 1 (mod 4), and no two numbers, each of which
is 0 or 1, add to 3. ▲

We shall see later that the rational primes of the form 4k + 1 always 
split as the product of two nonassociated Gaussian primes, and that this 
exhausts the set of Gaussian primes. Thus every Gaussian prime is a 
factor of exactly one rational prime. 
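
The contrast between the two kinds of odd primes can be observed by a brute-force search. This Python sketch is illustrative only, and the helper names `two_squares` and `is_rational_prime` are hypothetical. It looks for a representation $p = a^2 + b^2$, which would give the factorization $p = (a + bi)(a - bi)$ into factors of norm $p$. Finding none for a prime of the form 4k - 1 agrees with the proof of Theorem 6-2; finding one for each small prime of the form 4k + 1 is merely consistent with the statement whose proof is deferred.

```python
from math import isqrt


def two_squares(p):
    """Return (a, b) with a*a + b*b == p and 0 <= a <= b, or None if there is none."""
    for a in range(isqrt(p) + 1):
        rest = p - a * a
        b = isqrt(rest)
        if b * b == rest and a <= b:
            return (a, b)
    return None


def is_rational_prime(n):
    """Trial-division primality test for small n."""
    if n < 2:
        return False
    return all(n % k != 0 for k in range(2, isqrt(n) + 1))


for p in range(2, 60):
    if is_rational_prime(p):
        print(p, p % 4, two_squares(p))
```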


PROBLEMS 


1. Show that $N\alpha | N\beta$ does not imply $\alpha | \beta$.

2. Show that associates have the same norm, but that two Gaussian integers
having the same norm need not be associates.

3. Show that if $N\alpha$ is a rational prime, then $\alpha$ is a Gaussian prime.

4. Show that $(1 + i) \nmid (1 + 2i)$ by direct consideration of the fraction
$(1 + 2i)/(1 + i)$.

5. Show that $(1 + i) \nmid (1 + 2i)$ using the multiplicativity of the norm.

6. Show that a Gaussian integer has only finitely many divisors in Z[i]. Find
all the divisors of 10, and prove that there are no others. Do the same for $3 + 7i$.

7. Use the multiplicativity of the norm to show that if each of two rational 
integers m and n is the sum of two rational squares, the same is true of the 
product mn. 


6-3 The greatest common divisor. Up to this point, nothing has been
said about inequalities in Z[i]. As a matter of fact, there is no way to
introduce the relation "<" in Z[i] in such a way that the following two
statements hold:

(a) For any two elements $\alpha$ and $\beta$ of Z[i], exactly one of the relations
$\alpha < \beta$, $\alpha = \beta$, or $\alpha > \beta$ holds.

(b) If $\alpha < \beta$ and $0 < \gamma$, then $\alpha\gamma < \beta\gamma$.

To see this, we note first that under any definition of "$\alpha < \beta$" which is
consistent with (a) and (b), necessarily 0 < 1. For if not, then by
(a) it must be that 0 < -1. But then by (b), with $\alpha = 0$ and $\beta = \gamma = -1$,
we have $0 < (-1)^2 = 1$, a contradiction.

We complete the proof of the asserted impossibility by showing that
neither of the relations $i < 0$ or $0 < i$ can hold. For if $i < 0$, then
$0 < -i$, and hence $0 < (-i)^2 = -1$, which is false. If $0 < i$, then
$0 < i^2 = -1$, which is also false.

Since the entire theory of rational integers developed earlier depended
ultimately on Theorem 1-1, which involves an inequality, it is important
to note that a weak sort of comparison of elements of $Z[i]$ can be effected
by comparing norms. This is what is done in the following analog of
Theorem 1-1.

THEOREM 6-3. If $\alpha$ and $\beta$ are integers of $Z[i]$, and $\beta \ne 0$, then there
are $\kappa$ and $\rho$ in $Z[i]$ such that

$$\alpha = \beta\kappa + \rho, \qquad N\rho < N\beta.$$

Proof: Since $\beta \ne 0$, we can write

$$\frac{\alpha}{\beta} = \frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{c^2 + d^2} = A + Bi,$$

where $A$ and $B$ are rational numbers, not necessarily integers. Let $x$
and $y$ be the rational integers nearest to $A$ and $B$, respectively, so that

$$|A - x| \le \tfrac{1}{2}, \qquad |B - y| \le \tfrac{1}{2}.$$

Then

$$\left|\frac{\alpha}{\beta} - (x + yi)\right| = |(A - x) + (B - y)i| = \left((A - x)^2 + (B - y)^2\right)^{1/2} \le \left(\tfrac{1}{4} + \tfrac{1}{4}\right)^{1/2} < 1.$$

Hence, if we set

$$x + yi = \kappa, \qquad \alpha - \beta(x + yi) = \rho,$$

then

$$N\rho = N(\alpha - \kappa\beta) = N\beta \cdot N\!\left(\frac{\alpha}{\beta} - \kappa\right) < N\beta,$$

and $\kappa$ and $\rho$ are in $Z[i]$. ▲
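The rounding step in the proof of Theorem 6-3 can be sketched in code. The following is only an illustrative sketch, not part of the text: it assumes a Gaussian integer $a + bi$ is stored as the pair `(a, b)`, and the names `gauss_divmod` and `norm` are hypothetical helpers chosen for this example.

```python
def norm(z):
    """Norm of the Gaussian integer z = (a, b), i.e. a^2 + b^2."""
    return z[0] * z[0] + z[1] * z[1]


def gauss_divmod(alpha, beta):
    """Return (kappa, rho) with alpha = beta*kappa + rho and N(rho) < N(beta).

    Follows the proof of Theorem 6-3: write alpha/beta = A + Bi and round
    A and B to nearest rational integers (exact integer arithmetic only).
    """
    a, b = alpha
    c, d = beta
    n = c * c + d * d                      # N(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    # alpha * conj(beta) = (ac + bd) + (bc - ad)i, so A = re/n and B = im/n.
    re = a * c + b * d
    im = b * c - a * d
    # floor(t + 1/2) is a nearest integer to t.
    x = (2 * re + n) // (2 * n)
    y = (2 * im + n) // (2 * n)
    # rho = alpha - beta * (x + yi)
    rho = (a - (c * x - d * y), b - (c * y + d * x))
    return (x, y), rho
```

For instance, `gauss_divmod((7, 11), (3, 5))` gives quotient `(2, 0)` and remainder `(1, 1)`, that is, $7 + 11i = (3 + 5i) \cdot 2 + (1 + i)$.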
On the basis of Theorem 6-3, the Euclidean algorithm can now be
generalized to Gaussian integers, as follows:

$$\alpha = \beta\kappa_1 + \rho_1, \qquad N\rho_1 < N\beta,$$
$$\beta = \rho_1\kappa_2 + \rho_2, \qquad N\rho_2 < N\rho_1,$$
$$\cdots$$
$$\rho_{k-2} = \rho_{k-1}\kappa_k + \rho_k, \qquad N\rho_k < N\rho_{k-1},$$
$$\rho_{k-1} = \rho_k\kappa_{k+1}.$$

The sequence of equations must terminate, because $N\beta, N\rho_1, N\rho_2, \ldots$ is
a decreasing sequence of nonnegative integers. It can be shown that $\rho_k$,
the last nonvanishing remainder, is a divisor of both $\alpha$ and $\beta$, by working
up from the last equation to the first, and it can be shown that every
common divisor of $\alpha$ and $\beta$ is a divisor of $\rho_k$, by working down from the
first equation to the last, just as was done in Section 2-2. From the next
to last equation, $\rho_k$ can be written as a linear combination of $\rho_{k-1}$ and
$\rho_{k-2}$, and then $\rho_{k-1}, \rho_{k-2}, \ldots, \rho_1$ can be successively eliminated with the
help of the earlier equations to yield $\rho_k$ as a linear combination of $\alpha$ and $\beta$.
Thus $\rho_k$ is a Gaussian integer having all the properties of the number $\delta$
listed in the following theorem.


THEOREM 6-4. Let $\alpha$ and $\beta$ be Gaussian integers, not both 0. Then
there is an integer $\delta$ of $Z[i]$ with the following properties:
(i) $\delta \mid \alpha$ and $\delta \mid \beta$.
(ii) If $\delta'$ is any integer such that $\delta' \mid \alpha$ and $\delta' \mid \beta$, then $\delta' \mid \delta$.
(iii) There are $\xi$ and $\eta$ in $Z[i]$ such that $\delta = \alpha\xi + \beta\eta$.
Any two integers $\delta_1$ and $\delta_2$ having properties (i) and (ii) are associates.


Proof: It is only the uniqueness of $\delta$ (except for a possible unit factor)
which remains in doubt. Suppose that $\delta_1$ and $\delta_2$ are two Gaussian integers
having properties (i) and (ii). Then since $\delta_1 \mid \alpha$ and $\delta_1 \mid \beta$, it follows from
(ii), with $\delta' = \delta_1$ and $\delta = \delta_2$, that $\delta_1 \mid \delta_2$. By symmetry, we also have
$\delta_2 \mid \delta_1$. Hence $\delta_1$ and $\delta_2$ are associates. ▲

Any Gaussian integer $\delta$ having properties (i) and (ii) is called a GCD
of $\alpha$ and $\beta$, and we write $(\alpha, \beta) = \delta$. [Strictly speaking, we should of
course write “$(\alpha, \beta) = \pm\delta$ or $\pm i\delta$.”] If $(\alpha, \beta) = 1$, we say that $\alpha$ and $\beta$
are relatively prime.

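As an example, suppose we wish to find a GCD of $7 + 11i$ and $3 + 5i$.
We have

$$\frac{7 + 11i}{3 + 5i} = \frac{(7 + 11i)(3 - 5i)}{34} = \frac{76 - 2i}{34},$$

and the Gaussian integer nearest to this quotient is 2, so that

$$7 + 11i = (3 + 5i) \cdot 2 + (1 + i).$$

Similarly,

$$\frac{3 + 5i}{1 + i} = \frac{8 + 2i}{2} = 4 + i,$$

$$3 + 5i = (1 + i)(4 + i) + 0.$$

Thus $(7 + 11i, 3 + 5i) = 1 + i$.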
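The back-substitution described before Theorem 6-4 can be mechanized as an extended Euclidean algorithm. The sketch below is an illustration under stated assumptions, not the book's own procedure: Gaussian integers are stored as pairs `(a, b)`, and `gauss_divmod`, `mul`, `sub`, and `gauss_xgcd` are hypothetical helper names. The returned GCD is determined only up to a unit factor.

```python
def gauss_divmod(alpha, beta):
    """Division with remainder in Z[i], rounding alpha/beta to a nearest integer."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    x = (2 * (a * c + b * d) + n) // (2 * n)
    y = (2 * (b * c - a * d) + n) // (2 * n)
    return (x, y), (a - (c * x - d * y), b - (c * y + d * x))


def mul(u, v):
    """Product of two Gaussian integers."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def sub(u, v):
    """Difference of two Gaussian integers."""
    return (u[0] - v[0], u[1] - v[1])


def gauss_xgcd(alpha, beta):
    """Return (delta, xi, eta) with delta = alpha*xi + beta*eta a GCD of alpha, beta."""
    r0, r1 = alpha, beta
    s0, s1 = (1, 0), (0, 0)      # coefficients of alpha
    t0, t1 = (0, 0), (1, 0)      # coefficients of beta
    while r1 != (0, 0):
        kappa, rho = gauss_divmod(r0, r1)
        r0, r1 = r1, rho
        s0, s1 = s1, sub(s0, mul(kappa, s1))
        t0, t1 = t1, sub(t0, mul(kappa, t1))
    return r0, s0, t0
```

Applied to the example above, `gauss_xgcd((7, 11), (3, 5))` returns `((1, 1), (1, 0), (-2, 0))`, which says $1 + i = (7 + 11i) \cdot 1 + (3 + 5i) \cdot (-2)$.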


PROBLEMS 


1. Compute the following GCD’s:
(a) $(16 - 2i, 38 + 17i)$, (b) $(4 + 6i, 7 - i)$, (c) $(6 + i, 4 - 3i)$.

2. Find the exact conditions under which $(1 + i) \mid (a + bi)$.

3. Express each of the GCD’s in Problem 1 as a linear combination of the two
entries, with coefficients in $Z[i]$.

4. Show that $(\mu\alpha, \mu\beta) = \mu(\alpha, \beta)$ if $\mu \ne 0$.

5. Show that if $\mu \mid \alpha$ and $\mu \mid \beta$, then $(\alpha/\mu, \beta/\mu) = (\alpha, \beta)/\mu$.

6. Show that if $\alpha$ is relatively prime to each of the numbers $\beta_1, \ldots, \beta_n$, then
$\alpha$ is relatively prime to $\beta_1 \cdots \beta_n$.

7. Show that if $\beta \mid \alpha$, $\gamma \mid \alpha$, and $(\beta, \gamma) = 1$, then $\beta\gamma \mid \alpha$.

8. Find an analog of Theorem 6-3 in the domain $Z[\sqrt{-2}]$ of numbers of the
form $a + b\sqrt{-2}$, with $a$ and $b$ in $Z$. Is Theorem 6-4 valid in this domain?
What happens in the domain $Z[\sqrt{-5}]$?


6-4 The unique factorization theorem in Z[i]. The next four theorems 
are analogs of the theorems of Section 2-3. The only significant change is 
that inequalities between integers have been replaced by inequalities 
between their norms. 


THEOREM 6-5. Every integer $\alpha$ with $N\alpha > 1$ can be represented as a
finite product of primes.


Proof: The proof is carried out by induction on $N\alpha$. The smallest case
to consider is $N\alpha = 2$: the four Gaussian integers of norm 2 are associates
of the prime $1 + i$, and are therefore primes themselves. Suppose then
that $\alpha$ is an integer with the property that every integer of smaller norm
(and not a unit) has a representation of the required sort. If $\alpha$ is itself
prime, we are through. Otherwise $\alpha$ has a decomposition $\alpha = \beta\gamma$, with
$1 < N\beta < N\alpha$ and $1 < N\gamma < N\alpha$. The induction hypothesis then implies that

$$\beta = \pi_1' \cdots \pi_r' \quad\text{and}\quad \gamma = \pi_1'' \cdots \pi_s'',$$

where the $\pi_i'$ and $\pi_j''$ are primes in $Z[i]$, and hence

$$\alpha = \pi_1' \cdots \pi_r' \, \pi_1'' \cdots \pi_s''. \ ▲$$


THEOREM 6-6. If $\alpha \mid \beta\gamma$ and $(\alpha, \beta) = 1$, then $\alpha \mid \gamma$.


Proof: If $(\alpha, \beta) = 1$, there are integers $\xi$ and $\eta$ such that $\alpha\xi + \beta\eta = 1$,
and therefore $\alpha\gamma\xi + \beta\gamma\eta = \gamma$. But $\alpha$ divides both $\alpha\gamma$ and $\beta\gamma$, and hence
the left side of the last equation, and therefore $\alpha$ divides $\gamma$. ▲


THEOREM 6-7. If $\pi, \pi_1, \ldots, \pi_n$ are Gaussian primes, and $\pi \mid \pi_1 \cdots \pi_n$,
then for at least one $m$, $\pi$ is an associate of $\pi_m$.


Proof: Suppose that $\pi \mid \pi_1 \cdots \pi_n$, but that $\pi$ is different from any of
$\pi_1, \ldots, \pi_{n-1}$ or their associates. Then $\pi$ is relatively prime to each of
$\pi_1, \ldots, \pi_{n-1}$, and so is relatively prime to the product $\pi_1 \cdots \pi_{n-1}$.
(This follows from the fact that if $(\alpha, \beta) = 1$ and $(\alpha, \gamma) = 1$, then
$(\alpha, \beta\gamma) = 1$, an implication which in turn follows by multiplying together
the equations $\alpha\xi + \beta\eta = 1$ and $\alpha\mu + \gamma\nu = 1$ to obtain 1 as the linear
combination $1 = \alpha(\alpha\xi\mu + \beta\eta\mu + \gamma\nu\xi) + \beta\gamma \cdot \eta\nu$ of $\alpha$ and $\beta\gamma$.) By
Theorem 6-6, $\pi \mid \pi_n$, so that $\pi$ and $\pi_n$ are associates. ▲


THEOREM 6-8 (Unique Factorization Theorem for $Z[i]$). The representation
of each Gaussian integer $\alpha$ with $N\alpha > 1$ as a product of primes is
unique except for the order of factors and the presence of units.


Proof: Suppose that $\alpha$ is any element of $Z[i]$ with $N\alpha > 1$ and having
the two factorizations

$$\alpha = \pi_1 \cdots \pi_r = \pi_1' \cdots \pi_s'.$$

Then $\pi_1 \mid \pi_1' \cdots \pi_s'$, so by Theorem 6-7, $\pi_1$ is an associate of one of the $\pi_i'$,
which we may as well take to be $\pi_1'$. Cancelling $\pi_1$ from the above equation,
we obtain

$$\frac{\alpha}{\pi_1} = \pi_2 \cdots \pi_r = \varepsilon_1 \pi_2' \cdots \pi_s',$$

where $\varepsilon_1$ is a unit. The argument can now be repeated, with the primes
from the two factorizations successively paired off and cancelled, there
being equality to within a unit factor between the remaining prime products
at each stage. When all primes $\pi_1, \ldots, \pi_r$ have been eliminated, all
primes must be gone from the other factorization too, with just a unit
remaining, and this unit must in fact be 1. ▲


PROBLEMS 


1. Show that if $\alpha$ is not prime, it has a prime factor $\pi$ with $N\pi \le \sqrt{N\alpha}$.

2. (a) List all Gaussian integers $\alpha = a + bi$ with $N\alpha < 9$ which lie in the
first quadrant, that is, those for which $a > 0$ and $b > 0$. Multiply the numbers
in this list by $1 + i$, and find the associates in the first quadrant of these multiples.
Using Problem 1, find all Gaussian primes $\pi$ with $N\pi < 9$.

(b) Continuing, list all Gaussian integers $\alpha$ in the first quadrant with $N\alpha < 81$,
and then find all primes $\pi$ with $N\pi < 81$.

3. Find the prime decompositions of (a) $5 + 6i$, (b) $7 - 3i$.


6-5 The primes in Z[i]. We have already seen that in $Z[i]$, the rational
prime 2 splits as the product of the associated primes $1 + i$ and $1 - i$,
and that the rational primes $p \equiv 3 \pmod 4$ remain prime. All other
rational primes are of the form $4k + 1$, and we now turn our attention to
these.


THEOREM 6-9. Every rational prime $p \equiv 1 \pmod 4$ splits in $Z[i]$ as
the product of two nonassociated Gaussian primes.

Proof: Suppose that $p \equiv 1 \pmod 4$. Then according to Euler’s criterion
(Theorem 3-15), $-1$ is a quadratic residue of $p$, and hence there is
a rational integer $x$ such that $x^2 \equiv -1 \pmod p$. Thus $p \mid (x^2 + 1)$. Now
in $Z[i]$ we have

$$x^2 + 1 = (x - i)(x + i),$$

and if $p$ were prime in $Z[i]$, then it would have to divide one of the factors
$x - i$ and $x + i$. But this is obviously not the case, since $p \nmid \pm 1$. Therefore
$p$ splits in $Z[i]$ into two non-unit factors.

Suppose that there is a factorization

$$p = \alpha\beta, \qquad (2)$$

where we do not suppose that $\alpha$ and $\beta$ are prime, but only that they are
not units. Then $Np = p^2 = N\alpha N\beta$, and since $N\alpha$ and $N\beta$ are rational
integers different from 1, it must be that

$$N\alpha = N\beta = p. \qquad (3)$$

If $\alpha$ were not prime, we should have

$$\alpha = \pi_1\pi_2 \cdots \pi_r, \qquad r > 1,$$

and hence

$$p = N\pi_1 N\pi_2 \cdots N\pi_r,$$

which is impossible since $p$ is a rational prime. Similarly, $\beta$ must be prime.
From (2) and (3) we have three decompositions for $p$,

$$p = \alpha\beta = \alpha\bar{\alpha} = \beta\bar{\beta},$$

and from the first two we see that $\beta = \bar{\alpha}$. Thus $\bar{\alpha}$ is also prime, and $p$ is
the product of two complex-conjugate Gaussian primes:

$$p = \pi\bar{\pi}. \qquad (4)$$

It remains to be shown that these primes are not associated. Let

$$\pi = a + bi,$$

and assume that $\bar{\pi} = \varepsilon\pi$ for some unit $\varepsilon$. For the four units, we make the
following deductions by comparing real and imaginary parts in the equation
$a - bi = \varepsilon(a + bi)$:

$$\varepsilon = 1: \quad b = 0,$$
$$\varepsilon = -1: \quad a = 0,$$
$$\varepsilon = i: \quad a = -b,$$
$$\varepsilon = -i: \quad a = b.$$

But none of these alternatives is possible, because of (4):

$$b = 0 \ \text{implies}\ p = a^2,$$
$$a = 0 \ \text{implies}\ p = b^2,$$
$$a = -b \ \text{implies}\ p = a^2(1 - i)(1 + i) = 2a^2,$$
$$a = b \ \text{implies}\ p = a^2(1 + i)(1 - i) = 2a^2.$$

Hence $\bar{\pi} \ne \varepsilon\pi$. ▲

Equation (4) shows that every $p \equiv 1 \pmod 4$ is the norm of a Gaussian
prime: $p = (a + bi)(a - bi) = a^2 + b^2$. This has the following immediate
consequence in rational number theory:


THEOREM 6-10. Every prime of the form $4k + 1$ can be represented as
the sum of two squares, $p = a^2 + b^2$, and this representation is unique
except for order and sign of $a$ and $b$.


This theorem was discovered empirically by Fermat, and proved by 
Euler. There are proofs which are not based on Gaussian integers, but the 
present proof indicates clearly how the theory of other kinds of integers 
can be used to obtain information about the rational integers. 
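The proof of Theorem 6-9 is constructive enough to compute the representation $p = a^2 + b^2$: find $x$ with $x^2 \equiv -1 \pmod p$, then take a GCD of $p$ and $x + i$ in $Z[i]$. The following sketch is an illustration only. It assumes $p$ is a prime with $p \equiv 1 \pmod 4$, searches for $x$ by brute force (adequate for small $p$), and uses the hypothetical helper names `gauss_divmod` and `sum_of_two_squares`.

```python
def gauss_divmod(alpha, beta):
    """Division with remainder in Z[i]; Gaussian integers are pairs (a, b)."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d
    x = (2 * (a * c + b * d) + n) // (2 * n)
    y = (2 * (b * c - a * d) + n) // (2 * n)
    return (x, y), (a - (c * x - d * y), b - (c * y + d * x))


def sum_of_two_squares(p):
    """For a prime p = 4k + 1, return (a, b) with 0 < a <= b and a^2 + b^2 = p."""
    # Brute-force search for x with x^2 = -1 (mod p); assumed to exist.
    x = next(t for t in range(1, p) if (t * t + 1) % p == 0)
    # Euclidean algorithm in Z[i] applied to p and x + i.
    alpha, beta = (p, 0), (x, 1)
    while beta != (0, 0):
        _, rho = gauss_divmod(alpha, beta)
        alpha, beta = beta, rho
    # alpha is now a prime factor of p, determined up to a unit.
    a, b = sorted((abs(alpha[0]), abs(alpha[1])))
    return a, b
```

For example, `sum_of_two_squares(13)` returns `(2, 3)` and `sum_of_two_squares(29)` returns `(2, 5)`.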

We now show that we have found all the primes in Z[i]. 


THEOREM 6-11. The associates of the following represent all the
Gaussian primes:
the associated divisors $1 + i$ and $1 - i$ of 2,
the rational primes $p \equiv 3 \pmod 4$,
the nonassociated conjugate prime divisors $a + bi$ and $a - bi$
of the rational primes $p \equiv 1 \pmod 4$.


Proof: Let $\pi$ be a Gaussian prime, and let $N\pi = \pi\bar{\pi} = a$. The rational
integer $a$ can have at most two rational prime factors, since otherwise it
would have more than two prime factors in $Z[i]$, which is not the case.
Moreover, if $a = pq$, where $p$ and $q$ are rational primes, then $p = q$.
For suppose that $p \ne q$. Then $|p| \ne |q|$, whereas $|\pi| = |\bar{\pi}|$, and the
unique factorization theorem for $Z[i]$ is violated, whether or not $p$ and $q$
are also prime in $Z[i]$.

Thus either $N\pi = p^2$ or $N\pi = p$, where $p$ is a rational prime. In the
first case, $\pi\bar{\pi} = p \cdot p$; so $\pi = \bar{\pi} = p$, and as we know, this happens
exactly when $p \equiv 3 \pmod 4$. On the other hand, if $N\pi = p$, then $\pi$ and
$\bar{\pi}$ are the nonreal Gaussian prime factors of $p$, and there are such factors
when $p = 2$ and when $p \equiv 1 \pmod 4$. ▲
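Theorem 6-11 translates directly into a primality test for Gaussian integers. The sketch below is one possible reading of the theorem, not a procedure given in the text: the function names are hypothetical, and trial division is used for rational primality only because it is short.

```python
def is_rational_prime(n):
    """Trial-division primality test for rational integers."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True


def is_gaussian_prime(a, b):
    """Decide whether a + bi is a Gaussian prime, following Theorem 6-11."""
    if a == 0 or b == 0:
        # Associates of a rational integer: prime exactly when |a + b|
        # is a rational prime congruent to 3 (mod 4).
        p = abs(a + b)
        return is_rational_prime(p) and p % 4 == 3
    # Both components nonzero: prime exactly when the norm is a rational
    # prime (which is then 2 or congruent to 1 (mod 4)).
    return is_rational_prime(a * a + b * b)
```

Under these assumptions, `is_gaussian_prime(1, 1)`, `is_gaussian_prime(3, 0)`, and `is_gaussian_prime(2, -1)` are true, while `is_gaussian_prime(5, 0)` and `is_gaussian_prime(3, 1)` are false.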


PROBLEMS 


1. Using the theorems of this section, list all Gaussian primes $\pi$ with
$N\pi < 100$.

2. Find the Gaussian prime decompositions of (a) $7 + 6i$, (b) $7 + 5i$,
(c) $8 + 5i$. [Hint: First factor $N\alpha$ in $Z$.]

3. (a) Show that if $p \equiv 3 \pmod 4$ is a rational prime and $n$ is an integer,
then $p \nmid (n^2 + 1)$. (Use the argument of the first paragraph of the proof of
Theorem 6-9.)

(b) Let $5, 13, 17, \ldots, p_k$ be the first $k$ primes $p \equiv 1 \pmod 4$. By considering
the rational prime factors of $N = (5 \cdot 13 \cdot 17 \cdots p_k)^2 + 1$, show that there are
infinitely many $p \equiv 1 \pmod 4$, and hence that there are infinitely many nonreal
Gaussian primes.


6-6 Another quadratic domain. For purposes of contrast, we terminate
the discussion of quadratic arithmetic with a brief description of a domain
of integers in which matters are not so simple as in $Z$ and $Z[i]$. This domain
is the set, which we call $Z[\sqrt{10}]$, of numbers of the form $a + b\sqrt{10}$, where
$a$ and $b$ are rational integers. It is clear that the sum, difference, and
product of elements of $Z[\sqrt{10}]$ are again in $Z[\sqrt{10}]$. We shall use capital
letters $A, B, \ldots$ to designate elements of $Z[\sqrt{10}]$.

Divisibility, units, and primes can be defined exactly as before:


We say that $B$ divides $A$, and write $B \mid A$, if there is a $C$ such that $A = BC$.

An element $E$ of $Z[\sqrt{10}]$ is called a unit if $E \mid 1$.

An element $P$ of $Z[\sqrt{10}]$ is said to be prime in $Z[\sqrt{10}]$ if in every
factorization $P = AB$, either $A$ or $B$ is a unit.


The norm $NA$ of the integer $A = a + b\sqrt{10}$ is the product
$(a + b\sqrt{10})(a - b\sqrt{10}) = a^2 - 10b^2$. It is easily seen to be multiplicative, so that
$N(AB) = NA \cdot NB$. This implies that the units are exactly the integers
with norm $\pm 1$:

$$1 = EF,$$
$$N1 = 1 = NE \cdot NF,$$
$$NE = NF = \pm 1.$$

Conversely, if $E = a + b\sqrt{10}$ has $NE = \pm 1$, then $E \cdot (\pm(a - b\sqrt{10})) = 1$, so $E \mid 1$.

It is no longer the case that the norm is always nonnegative, so we must
allow the possibility of $NE = -1$. A more serious complication is that
there are now infinitely many units. For the equation $NE = a^2 - 10b^2 = -1$
has the solution $a = 3$, $b = 1$, so $E = 3 + \sqrt{10}$ is a unit. Since
$N(E^n) = (NE)^n = (-1)^n$, every power of $E$ is also a unit. Since $E > 1$,
it is clear that $\cdots < E^{-1} < 1 < E < E^2 < E^3 < \cdots$, and hence
these powers of $E$ are distinct, and they constitute an infinite set of units.
It should be noted that the units, with the exception of $\pm 1$, do not have
absolute value 1, so that associates (elements differing only by a unit factor)
have the same “size” only in the sense of norm, and not in absolute value.
It is fortuitous that norm and absolute value nearly coincide for Gaussian
integers.
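The claim that the powers of $E = 3 + \sqrt{10}$ are all units can be checked numerically. The sketch below is illustrative only: it assumes an element $a + b\sqrt{10}$ is stored as the pair `(a, b)`, and `mul`, `norm`, and `unit_powers` are hypothetical names introduced for this example.

```python
def mul(u, v, d=10):
    """Product (a + b*sqrt(d)) * (c + e*sqrt(d)) as a pair of components."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)


def norm(u, d=10):
    """Norm a^2 - d*b^2 of a + b*sqrt(d)."""
    return u[0] * u[0] - d * u[1] * u[1]


def unit_powers(count):
    """Return [(E^n, N(E^n))] for n = 1..count, where E = 3 + sqrt(10)."""
    E = (3, 1)
    power = (1, 0)
    out = []
    for _ in range(count):
        power = mul(power, E)
        out.append((power, norm(power)))
    return out
```

The first three entries are $E = 3 + \sqrt{10}$ with norm $-1$, $E^2 = 19 + 6\sqrt{10}$ with norm $1$, and $E^3 = 117 + 37\sqrt{10}$ with norm $-1$, in agreement with $N(E^n) = (-1)^n$.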
Another difficulty as regards $Z[\sqrt{10}]$ is the absence of a unique factorization
theorem. For example, the element 6 of $Z[\sqrt{10}]$ has two genuinely
different prime factorizations, namely

$$6 = 2 \cdot 3 = (4 + \sqrt{10})(4 - \sqrt{10}).$$

To see that these really are different, it suffices to show that 2, 3, and
$4 \pm \sqrt{10}$ are prime in $Z[\sqrt{10}]$, and that neither $4 + \sqrt{10}$ nor $4 - \sqrt{10}$
is an associate of 2.

To prove the primality of 2, 3, and $4 \pm \sqrt{10}$, we note first that for no
$A$ in $Z[\sqrt{10}]$ do we have $|NA| = 2$ or 3. For the congruences $a^2 \equiv \pm 2$ or
$\pm 3 \pmod{10}$ are insolvable (in other language, the decimal expansion of
the square of a rational integer never terminates in 2, 3, 7, or 8), and
hence the equations $a^2 - 10m = \pm 2$ or $\pm 3$ are insolvable in $Z$, from
which it follows that the equations $a^2 - 10b^2 = \pm 2$ or $\pm 3$ also are
insolvable in $Z$.

If $2 = AB$, then $4 = NA \cdot NB$, and since $|NA| \ne 2$, either $|NA|$ or
$|NB|$ is 1. Thus 2 is prime in $Z[\sqrt{10}]$. If $3 = AB$, then $9 = NA \cdot NB$, and
since $|NA| \ne 3$, either $|NA|$ or $|NB|$ is 1. Thus 3 is prime in $Z[\sqrt{10}]$.
Finally, if $4 \pm \sqrt{10} = AB$, then $6 = NA \cdot NB$, and since $|NA| \ne 2, 3$,
either $|NA|$ or $|NB|$ is 1. Thus $4 \pm \sqrt{10}$ is prime in $Z[\sqrt{10}]$.

To see that neither $4 + \sqrt{10}$ nor $4 - \sqrt{10}$ is an associate of 2, we
compare norms. If $2 = E(4 + \sqrt{10})$, where $E$ is a unit of $Z[\sqrt{10}]$, then
$N2 = 4 = NE \cdot N(4 + \sqrt{10}) = \pm 1 \cdot 6$, which is false. Similarly,
$2 \ne E(4 - \sqrt{10})$. ▲
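The two arithmetic facts used above, that no square ends in 2, 3, 7, or 8 and that $(4 + \sqrt{10})(4 - \sqrt{10}) = 6$, are easy to confirm by machine. This is only a small sanity check under the assumption that $a + b\sqrt{10}$ is stored as `(a, b)`; the function names are hypothetical.

```python
def squares_mod_10():
    """Sorted list of the residues a^2 mod 10."""
    return sorted({(a * a) % 10 for a in range(10)})


def norm_two_or_three_possible():
    """True if a^2 = +-2 or +-3 (mod 10) had a solution, i.e. if some
    element of Z[sqrt(10)] could have |norm| equal to 2 or 3."""
    forbidden = {2, 3, 7, 8}          # +-2 and +-3 reduced mod 10
    return bool(forbidden & set(squares_mod_10()))


def mul(u, v, d=10):
    """Product (a + b*sqrt(d)) * (c + e*sqrt(d)) as a pair of components."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)
```

Here `squares_mod_10()` is `[0, 1, 4, 5, 6, 9]`, `norm_two_or_three_possible()` is false, and `mul((4, 1), (4, -1))` is `(6, 0)`.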

This example shows that there is no Euclidean algorithm in $Z[\sqrt{10}]$,
and hence not even a division theorem of the customary sort. That is, it
may not be possible to find $Q$ and $R$ such that $A = BQ + R$, if it is
required that $|NR|$ be smaller than $|NB|$. Moreover, two integers of
$Z[\sqrt{10}]$ do not always have a GCD which is a linear combination of those
integers. It should be apparent that the lack of unique factorization in
$Z[\sqrt{10}]$ is a much more serious difficulty than the existence of infinitely
many units; the latter is surprising at first, and leads to some slight complications,
but it does not invalidate whole sections of the usual arithmetic
theory. The attempt to restore unique factorization in domains
such as $Z[\sqrt{10}]$ was one of the starting points of modern abstract algebra.
The successful solution of the problem is too lengthy for inclusion here.


CHAPTER 7 
DIOPHANTINE EQUATIONS 


7-1 Introduction. As was mentioned in Chapter 1, there is rather little 
systematic knowledge that could be called a general theory of Diophantine 
equations, especially of equations of degree larger than 2. Sometimes a 
method especially devised for one problem has applications elsewhere, 
of course; this is true, for instance, of the so-called method of infinite 
descent invented by Fermat, of which we shall give an example when we 
consider the equation $x^4 + y^4 = z^4$. (This is really an inductive proof in
disguise; an equation is shown to have no solution by supposing it has 
solutions, considering a “smallest” solution in some sense, and deriving a 
still smaller solution.) But all too frequently the proofs are entirely ad hoc, 
and of no use for new problems. 

We shall consider here, in addition to the quartic equation above, the 
Pythagorean equation $x^2 + y^2 = z^2$ and the Pell equation $x^2 - dy^2 = 1$.
Both of these equations have infinitely many solutions, which we shall be 
able to describe completely, but in quite different ways. 


7-2 The equation $x^2 + y^2 = z^2$. If we know all the primitive solutions
of this equation, in which $(x, y, z) = 1$, then we can find all other solutions
by multiplying $x$, $y$, and $z$ by an arbitrary integer $d$. Among the primitive
solutions it suffices to find those for which $x$, $y$, and $z$ are positive, since all
others arise from positive solutions by simple sign changes. Finally, in
any primitive solution exactly one of $x$ and $y$ must be odd, for at least
one must be odd to give a primitive solution, and if both were odd we
should have $x^2 + y^2 \equiv 2 \pmod 4$, while $z^2 \equiv 1$ or $0 \pmod 4$. We shall
discuss the solutions in which $x$ is odd; because of the symmetry of the
equation in $x$ and $y$, the solutions in which $y$ is odd can be obtained simply
by permuting $x$ and $y$ in the solutions now to be described.


THEOREM 7-1. A general primitive solution of

$$x^2 + y^2 = z^2, \qquad y \text{ even}, \quad x > 0, \quad y > 0, \quad z > 0,$$

is given by

$$x = a^2 - b^2, \qquad y = 2ab, \qquad z = a^2 + b^2,$$

where $a$ and $b$ are prime to each other and not both odd, and $a > b > 0$.

Proof: It is easily verified that for every such pair of integers $a$ and $b$,
the corresponding integers $x$, $y$, and $z$ satisfy all the requirements. It remains
to show that every solution arises from suitably chosen $a$ and $b$
satisfying the conditions of the theorem.

Suppose that $x^2 + y^2 = z^2$. Since $(x, y, z) = 1$, we also have $(y, z) = 1$,
so that $(z - y, z + y) = 1$ or 2. But $z$ is odd and $y$ is even, and therefore
$(z - y, z + y) = 1$. Hence from the equation

$$x^2 = (z - y)(z + y),$$

we deduce that $z - y$ and $z + y$ must be odd squares, since they are
positive. Now if $t$ and $u$ are integers of the same parity (both even or
both odd), there are integers $a$ and $b$ such that $t = a + b$ and $u = a - b$,
namely $a = (t + u)/2$ and $b = (t - u)/2$. Applying this in the case
where $t$ and $u$ are the odd numbers of which $z + y$ and $z - y$ are the
squares, respectively, we can set

$$z - y = (a - b)^2, \qquad z + y = (a + b)^2,$$

whence

$$z = \frac{(a - b)^2 + (a + b)^2}{2} = a^2 + b^2,$$

$$y = \frac{(a + b)^2 - (a - b)^2}{2} = 2ab,$$

$$x = (a - b)(a + b) = a^2 - b^2.$$

Since $(z - x, z + x) = 2$ because $x$ and $z$ are relatively prime and both
odd, and since $z - x = 2b^2$ and $z + x = 2a^2$, it must be that $(a, b) = 1$.
Since $x$ is odd, $a + b$ must be odd. Since $y > 0$, $a$ and $b$ must have the
same sign, and since $x$ is positive, $|a| > |b|$. Finally, since the pairs $a, b$
and $-a, -b$ yield the same solution, we can suppose that $a > b > 0$. ▲
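The parametrization of Theorem 7-1 can be turned into a small generator of primitive solutions. The sketch below is illustrative: the function name `primitive_triples` and the bound `max_a` are assumptions made for this example, not notation from the text.

```python
from math import gcd


def primitive_triples(max_a):
    """List the primitive solutions (x, y, z) of x^2 + y^2 = z^2 with y even
    that arise from pairs a > b > 0, (a, b) = 1, not both odd, a <= max_a."""
    triples = []
    for a in range(2, max_a + 1):
        for b in range(1, a):
            if gcd(a, b) != 1:
                continue                    # a and b must be relatively prime
            if a % 2 == 1 and b % 2 == 1:
                continue                    # a and b must not both be odd
            x = a * a - b * b
            y = 2 * a * b
            z = a * a + b * b
            triples.append((x, y, z))
    return triples
```

With `max_a = 4` this yields `(3, 4, 5)`, `(5, 12, 13)`, `(15, 8, 17)`, and `(7, 24, 25)`, coming from the pairs $(a, b) = (2, 1), (3, 2), (4, 1), (4, 3)$.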


PROBLEMS 
1. Referring to Theorem 7-1, show that every solution $x, y$ arises from just
one pair $a, b$ fulfilling the requirements mentioned there.


2. Let $p$ be a prime, and suppose that $x^2 + py^2 = z^2$, where $(x, y, z) = 1$.
Show that, except for the signs of $x$, $y$, and $z$, either

$$x = \frac{u^2 - pv^2}{2}, \qquad y = uv, \qquad z = \frac{u^2 + pv^2}{2}, \qquad u \text{ and } v \text{ both odd},$$

or

$$x = u^2 - pv^2, \qquad y = 2uv, \qquad z = u^2 + pv^2, \qquad \text{exactly one of } u \text{ and } v \text{ odd}.$$

7-3 The equation $x^4 + y^4 = z^4$. According to Fermat’s conjecture,
the equation $x^n + y^n = z^n$ never has a solution in nonzero integers $x, y, z$
if $n > 2$. Various necessary conditions for the existence of a solution are
known, and from these it is possible to show that there is no solution for
many different values of $n$, but the general conjecture has been neither
proved nor disproved. Indeed, it is not even known whether there can be
infinitely many solutions, for certain $n > 2$.

If $n > 2$, then $n$ is divisible either by 4 or by some odd prime, and we
call this divisor $r$, whichever it may be. Then $n = rm$ for suitable $m$,
and the equation $x^n + y^n = z^n$ is the same as $(x^m)^r + (y^m)^r = (z^m)^r$.
Hence, if it could be shown that the equation $X^r + Y^r = Z^r$ has no nonzero
solution, then, in particular, there would be no solution $X = x^m$,
$Y = y^m$, $Z = z^m$, and consequently no solution of $x^n + y^n = z^n$. Thus it
suffices to consider the Fermat equation for $n = 4$ or an odd prime. We
now treat the case $n = 4$.


THEOREM 7-2. The equation $x^4 + y^4 = z^4$ is not solvable in nonzero
integers.


Proof: It suffices to show that there is no primitive solution of the
equation

$$x^4 + y^4 = z^2.$$

Suppose that $x$, $y$, and $z$ constitute such a solution; with no loss in generality
we may take $x > 0$, $y > 0$, $z > 0$, and $y$ even. Writing the supposed
relation in the form

$$(x^2)^2 + (y^2)^2 = z^2,$$

we see from Theorem 7-1 that

$$x^2 = a^2 - b^2, \qquad y^2 = 2ab, \qquad z = a^2 + b^2,$$

where $(a, b) = 1$ and exactly one of $a$ and $b$ is odd. If $a$ were even, we
would have

$$1 \equiv x^2 = a^2 - b^2 \equiv -1 \pmod 4,$$

so $b$ is even. We apply Theorem 7-1 again, this time to the equation

$$x^2 + b^2 = a^2,$$

and obtain

$$x = p^2 - q^2, \qquad b = 2pq, \qquad a = p^2 + q^2,$$

where $(p, q) = 1$, $p > q > 0$, and not both $p$ and $q$ are odd. From
$y^2 = 2ab$ we have

$$y^2 = 4pq(p^2 + q^2).$$

Here any two of $p$, $q$ and $p^2 + q^2$ are relatively prime, and hence each
must be a square:

$$p = r^2, \qquad q = s^2, \qquad p^2 + q^2 = t^2,$$

whence $t > 1$ and

$$r^4 + s^4 = t^2.$$

Now

$$x = r^4 - s^4, \qquad y = 2rst, \qquad z = a^2 + b^2 = r^8 + 6r^4s^4 + s^8,$$

so that

$$z > (r^4 + s^4)^2 = t^4,$$

or $t < z^{1/4}$. It follows that if one nonzero solution of $x^4 + y^4 = z^2$
were known, another solution $r, s, t$ could be found for which $rst \ne 0$
and $1 < t < z^{1/4}$. If we started from $r, s, t$ instead of $x, y, z$, a third
solution $r', s', t'$ could then be found such that $1 < t' < t^{1/4}$, and so on.
But this would yield an infinite decreasing sequence of positive integers,
$z, t, t', \ldots$, which is impossible. Thus there is no nonzero solution. ▲


7-4 The equation $x^2 - dy^2 = 1$. The Diophantine equation

$$x^2 - dy^2 = N$$

(where $N$ and $d$ are integers), commonly known as Pell’s equation, was
actually never considered by Pell; it was because of a mistake on Euler’s
part that Pell’s name has been attached to it. The early Greek and Indian
mathematicians had considered special cases, but Fermat was the first to
treat it systematically. He said that he had shown, in the special case
where $N = 1$ and $d > 0$ is not a perfect square, that there are infinitely
many integral solutions $x, y$; as usual, he did not give a proof. The first
published proof was given by Lagrange, who used the theory of continued
fractions. Prior to this, Euler had shown that there are infinitely many
solutions if there is one.

Regardless of the name given to it, the equation is of considerable im- 
portance in number theory. We saw at the end of the preceding chapter 
how it arises in connection with the units of real quadratic domains, a 
subject seemingly removed from Diophantine equations. It also plays 
a central role in the theory of indefinite binary quadratic forms, a more 
advanced branch of the theory of numbers. Even within the theory of 
Diophantine equations, Pell’s equation is fundamental, because so many 
other equations can be reduced to it, or made to depend on it in some way. 
For example, knowledge of the solutions of Pell’s equation is essential
in finding integral solutions of the general quadratic equation

$$ax^2 + bxy + cy^2 + dx + ey + f = 0,$$

in which $a, b, \ldots, f$ are integers. For, writing the left side as a polynomial
in $x$,

$$ax^2 + (by + d)x + cy^2 + ey + f = 0,$$

we see that if the equation is solvable for a certain $y$, the discriminant

$$(by + d)^2 - 4a(cy^2 + ey + f)$$

or, what is the same thing,

$$(b^2 - 4ac)y^2 + (2bd - 4ae)y + d^2 - 4af$$

must be a perfect square, say $z^2$. Setting

$$b^2 - 4ac = p, \qquad 2bd - 4ae = q, \qquad d^2 - 4af = r,$$

we have

$$py^2 + qy + r - z^2 = 0.$$

Again, the discriminant of this quadratic in $y$ must be a perfect square, say

$$q^2 - 4p(r - z^2) = w^2.$$

Thus we are led to consider the Pell equation

$$w^2 - 4pz^2 = q^2 - 4pr;$$

once we know solutions of this equation, we can, at any rate, obtain
rational solutions of the original quadratic equation.

It might also be mentioned that Pell’s equation shares with the linear 
equation ax + by = c a unique position among Diophantine equations 
in two unknowns. It was shown in 1929 by C. L. Siegel that these two 
equations, together with the equations derivable from them by certain 
transformations, are the only algebraic equations in two variables which 
can have infinitely many integral solutions! 

Now to the solution. For the present we shall concern ourselves with
the equation

$$x^2 - dy^2 = 1. \qquad (1)$$

The case in which $d$ is a negative integer is easily settled: if $d = -1$,
then the only solutions are $\pm 1, 0$ and $0, \pm 1$, whereas if $d < -1$, the only
solutions are $\pm 1, 0$. So from now on we may restrict our attention to
equations of the form (1) with $d > 0$. If $d = d'^2$ is a square, then (1) can be
written as

$$x^2 - (d'y)^2 = 1,$$

and since the only two squares which differ by 1 are 0 and 1, the only
solutions in this case are $\pm 1, 0$. Suppose then that $d$ is not a square.

Except for the trivial solutions $\pm 1, 0$, we have $xy \ne 0$ and hence four
solutions, $\{x, y\}$, $\{x, -y\}$, $\{-x, y\}$, $\{-x, -y\}$, which are associated with
one another in a simple way. Let us confine our attention for the moment
to the positive solutions, in which $x > 0$ and $y > 0$. Equation (1) can
be written in the form $(x - y\sqrt{d})(x + y\sqrt{d}) = 1$, or

$$x - y\sqrt{d} = \frac{1}{x + y\sqrt{d}}, \qquad (2)$$

and for large $x$ and $y$ the right-hand side of this equation is very small.
Hence $x - y\sqrt{d}$, or $y(x/y - \sqrt{d})$, is also small, which means that $x/y$
is required to be a very good rational approximation to the irrational
number $\sqrt{d}$. It must, in fact, be such a good approximation that even
the product of the error $x/y - \sqrt{d}$ and the large number $y$ is very small.
This is a strong condition, and is satisfied only by exceptional fractions
$x/y$. But it must be satisfied infinitely many times, if (1) is to have infinitely
many solutions.

Conversely, if we could find positive integers $x$ and $y$ such that

$$0 < x - y\sqrt{d} < \frac{2}{x + y\sqrt{d}},$$

then we would have $0 < (x - y\sqrt{d})(x + y\sqrt{d}) = x^2 - dy^2 < 2$, and
since $x^2 - dy^2$ is an integer, it would follow that $x^2 - dy^2 = 1$. Thus
solutions of (1) give good approximations to $\sqrt{d}$, and sufficiently good
approximations to $\sqrt{d}$ provide solutions of (1). As was seen in Chapter 5,
the best approximations to an irrational number are furnished by the
convergents of the continued fraction expansion of that number, and
therefore we first look to see how (2) is related to the inequalities of
Chapter 5.


THEOREM 7-3. If (1) holds, and $x$ and $y$ are positive, then $x/y$ is a
convergent of the continued fraction expansion of $\sqrt{d}$.


Proof: By (2), $x - y\sqrt{d} > 0$, so that $x/y > \sqrt{d}$. Hence (2) implies
that

$$\frac{x}{y} - \sqrt{d} = \frac{1}{y(x + y\sqrt{d})} < \frac{1}{2y^2\sqrt{d}} < \frac{1}{2y^2},$$

and the result follows from Theorem 5-14. ▲

The converse of Theorem 7-3 is in general false. Instead, we have
the following result, which will lead to solutions of (1) in a rather roundabout
way.


THEOREM 7-4. Every convergent $p_n/q_n$ of the continued fraction expansion
of $\sqrt{d}$ provides a solution $x = p_n$, $y = q_n$ of one of the equations

$$x^2 - dy^2 = k,$$

where $k$ ranges over the finitely many integers such that $|k| < 1 + 2\sqrt{d}$.
In particular, one of these equations has infinitely many solutions.


Proof: By Theorem 5-12, we have

$$\left|\frac{p_n}{q_n} - \sqrt{d}\right| < \frac{1}{q_n^2},$$

and hence

$$|p_n - q_n\sqrt{d}| < \frac{1}{q_n} \qquad (3)$$

and

$$\frac{p_n}{q_n} < \sqrt{d} + \frac{1}{q_n^2} \le \sqrt{d} + 1. \qquad (4)$$

Thus, using first (3) and then (4), we obtain

$$|p_n^2 - dq_n^2| < \frac{1}{q_n}(p_n + q_n\sqrt{d}) = \frac{p_n}{q_n} + \sqrt{d} < 2\sqrt{d} + 1. \ ▲$$
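Theorem 7-4 can be observed numerically by listing the convergents of $\sqrt{d}$ together with the values $k = p_n^2 - dq_n^2$. The sketch below is an illustration, not the book's method: it uses the standard integer recurrence for the continued fraction of a quadratic surd (an assumption brought in for this example), and the function name `convergent_norms` is hypothetical.

```python
from math import isqrt


def convergent_norms(d, count):
    """Return [(p_n, q_n, p_n^2 - d*q_n^2)] for the first `count` convergents
    of the continued fraction of sqrt(d), with d a positive nonsquare."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # Complete quotients have the form (sqrt(d) + m) / den.
    m, den, a = 0, 1, a0
    p_prev, p = 1, a0
    q_prev, q = 0, 1
    out = [(p, q, p * p - d * q * q)]
    while len(out) < count:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q, p * p - d * q * q))
    return out
```

For $d = 7$ the first five convergents $2/1, 3/1, 5/2, 8/3, 37/14$ give $k = -3, 2, -3, 1, -3$, all within the bound $|k| < 1 + 2\sqrt{7}$, and the value $k = -3$ is seen to recur.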


In the remainder of the discussion of Pell’s equation, we shall have more
occasion to speak of the combination $x + y\sqrt{d}$ than of the solution
$\{x, y\}$. For this reason we pause for a moment to consider these irrational
numbers as interesting objects in themselves. Let us designate by $Z[\sqrt{d}]$
the set of all numbers of the form $a + b\sqrt{d}$ in which $a$ and $b$ are integers,
and use Greek letters to stand for the elements of $Z[\sqrt{d}]$. If $\alpha = a + b\sqrt{d}$,
then $a$ and $b$ are called the components of $\alpha$. For $\alpha$ and $\beta$ in $Z[\sqrt{d}]$, the
combinations $\alpha\beta$, $\alpha + \beta$, and $\alpha - \beta$ are again in $Z[\sqrt{d}]$, but $\beta/\alpha$ need
not be. For if by the conjugate of $\alpha = a + b\sqrt{d}$ we mean the element
$\bar{\alpha} = a - b\sqrt{d}$ of $Z[\sqrt{d}]$, then

$$\frac{\beta}{\alpha} = \frac{\beta\bar{\alpha}}{\alpha\bar{\alpha}} = \frac{\beta\bar{\alpha}}{a^2 - db^2},$$

and this is an element of $Z[\sqrt{d}]$ only if its components are integers, that
is, only if $a^2 - db^2$ divides the components of $\beta\bar{\alpha}$.

If $m$ is a positive integer, then we say that $m$ divides $\alpha$, or that $\alpha \equiv 0$
(mod $m$), if $m$ divides both components of $\alpha$. We say that $\alpha \equiv \beta$ (mod $m$)
if $\alpha - \beta \equiv 0$ (mod $m$). By completely trivial modifications in the proofs
given at the beginning of Chapter 3, we see that this kind of congruence
is again an equivalence relation, that two congruences with the same
modulus can be added or multiplied together, and that a congruence can
be multiplied through by an arbitrary factor from $Z[\sqrt{d}]$. There is one
slight change in that there are now $m^2$ residue classes, rather than $m$ as
before, because each of the two components can have any of $m$ incongruent
values.

Just as in the preceding chapter, we call the product $\alpha\bar{\alpha}$ the norm of $\alpha$,
and write $N(\alpha)$, or simply $N\alpha$. To ask for solutions of $x^2 - dy^2 = k$ is
to ask for elements $\alpha$ of $Z[\sqrt{d}]$ such that $N\alpha = k$. Clearly
$N\alpha\beta = (\alpha\beta)(\overline{\alpha\beta}) = \alpha\beta\bar{\alpha}\bar{\beta} = (\alpha\bar{\alpha})(\beta\bar{\beta}) = N\alpha N\beta$.

If the components of $\alpha$ are positive, then $\alpha$, which is a real number as
well as an element of $Z[\sqrt{d}]$, is larger than 1. The four elements of $Z[\sqrt{d}]$
which have the same components as $\alpha$ except for sign are $\alpha$, $\bar{\alpha}$, $-\alpha$, and
$-\bar{\alpha}$. If $\alpha$ has as components a positive solution of (1), then $\alpha\bar{\alpha} = 1$, and
the four numbers just mentioned are $\alpha$, $1/\alpha$, $-\alpha$, and $-1/\alpha$. Of these
the first is larger than 1, the second is between 0 and 1, the third is smaller
than $-1$, and the fourth lies between $-1$ and 0, so that the signs of $x$
and $y$ in (1) determine, and are determined by, the size of the associated $\alpha$.
To consider positive solutions of (1) is to consider elements $\alpha > 1$ of
$Z[\sqrt{d}]$ with norm 1.
TuroreM 7-5. Equation (1) has at least one solution with y # 0. 


Proof: According to Theorem 7-4, there is an integer k for which the 
equation Na = k has infinitely many solutions a in Z[/d]. Since there 
are only finitely many residue classes (mod k) in Z[./d], some residue class 
must contain at least two of these solutions (in fact, infinitely many!). 
Let us assume then that Na, = Nae = kand a; = ae (mod k), but that 
ay F a. Then Q)Ag = Ah, = 0 (mod k), so that B= Q1a2/k is an 
element of Z[\/d]; that is, it has integral components. Since 

a QQ bi A1a2 _ Na,Nag 


Ng = 6B = k2 a 2 = 1, 


@ yields a solution of (1). If the second component of 8 were 0, then 
Ng = 1 would imply that 8 = 1, whence 


aya = k = a1&, 
@o => &, 
ao = Q, 


contrary to hypothesis. A 
THEOREM 7-6. If $x_1, y_1$ and $x_2, y_2$ are solutions of the Pell equation
(1), then so are the integers $x, y$ defined by the equation

\[ (x_1 + y_1\sqrt d)(x_2 + y_2\sqrt d) = x + y\sqrt d. \tag{5} \]

Proof: The theorem merely asserts that if $N\alpha = 1$ and $N\beta = 1$, then
$N(\alpha\beta) = 1$. ▲

Theorem 7-5 shows that there is an $\alpha$ in $Z[\sqrt d]$ such that
$\alpha > 1$ and $N\alpha = 1$, and Theorem 7-6 demonstrates that all the
powers $\alpha^n$ give solutions of (1). Since
$\alpha < \alpha^2 < \alpha^3 < \cdots$, we see that (1) has infinitely many
distinct solutions. The next theorem shows that all the solutions arise, in
essence, from a single one.
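As a small illustration of how powers of one solution produce further solutions, the following sketch (with the hypothetical helper name `powers`) multiplies repeatedly by $3 + 2\sqrt 2$, a solution for $d = 2$, using the composition rule of equation (5).

```python
def powers(alpha, d, count):
    """Return alpha, alpha^2, ..., alpha^count as integer pairs (x, y)."""
    out, cur = [], (1, 0)                      # start from 1 = 1 + 0*sqrt(d)
    for _ in range(count):
        cur = (cur[0] * alpha[0] + d * cur[1] * alpha[1],
               cur[0] * alpha[1] + cur[1] * alpha[0])
        out.append(cur)
    return out


d = 2
solutions = powers((3, 2), d, 5)
assert all(x * x - d * y * y == 1 for x, y in solutions)
print(solutions)
# [(3, 2), (17, 12), (99, 70), (577, 408), (3363, 2378)]
```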


THEOREM 7-7. If $x_1, y_1$ is the minimal positive solution of equation (1),
then every solution $x, y$ is given by the equation

\[ x + y\sqrt d = \pm(x_1 + y_1\sqrt d)^n, \tag{6} \]

where $n$ can assume any integral value, positive, negative, or zero.

Because of this theorem, the minimal positive solution of (1) is sometimes
called the fundamental solution.


Proof: We have already seen that the four numbers $\alpha^n$, $1/\alpha^n$,
$-\alpha^n$, and $-1/\alpha^n$ give four solutions differing only in the signs
of $x$ and $y$, so we need only show that every $\alpha > 1$ such that
$N\alpha = 1$ is of the form $\alpha = \delta^n$ for suitable positive integer
$n$. Here $\delta$ is the fundamental solution, and therefore it is the
smallest element of $Z[\sqrt d]$ which is larger than 1 and has norm 1.

Since $\alpha > 1$ and $\delta$ is minimal, we have $\alpha \ge \delta$. Hence
there is a positive integer $n$ such that
$\delta^n \le \alpha < \delta^{n+1}$. Now
$\alpha/\delta^n = \alpha\bar\delta^n$ is in $Z[\sqrt d]$, and
$N(\alpha/\delta^n) = 1$. In other words, the number
$\alpha/\delta^n = \beta$ gives an integral solution of (1). From the
definition of $n$ it follows that $1 \le \beta < \delta$, and by the
definition of $\delta$ we cannot have $1 < \beta < \delta$. Hence
$\beta = 1$, and $\alpha = \delta^n$. ▲
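Theorem 7-7 can be checked by brute force for small $d$. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the fundamental solution is found by a naive search over $y = 1, 2, \ldots$, which is practical only when that solution is small. It compares the positive solutions found by exhaustive search with the powers of the fundamental solution.

```python
from math import isqrt


def fundamental_solution(d):
    """Smallest positive (x, y) with x^2 - d*y^2 = 1; d must be a nonsquare."""
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
        y += 1


def solutions_by_search(d, y_max):
    """All positive solutions with y <= y_max, found by testing every y."""
    found = []
    for y in range(1, y_max + 1):
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            found.append((x, y))
    return found


def solutions_by_powers(d, y_max):
    """Powers of the fundamental solution, as long as y <= y_max."""
    x1, y1 = fundamental_solution(d)
    out, (x, y) = [], (x1, y1)
    while y <= y_max:
        out.append((x, y))
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return out


for d in (2, 3, 7):
    assert solutions_by_search(d, 2000) == solutions_by_powers(d, 2000)
print(solutions_by_powers(7, 2000))   # [(8, 3), (127, 48), (2024, 765)]
```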


PROBLEMS 


1. Modify the proof of Theorem 7-3 to show that if $x^2 - dy^2 = N$,
$0 < N < \sqrt d$, then $x/y$ is a convergent of the continued fraction
expansion of $\sqrt d$.

2. Show that if $x^2 - dy^2 = -N$, $0 < N < \sqrt d$, then $x/y$ is a
convergent of the continued fraction expansion of $\sqrt d$. [Hint: Show that


\[ 0 < y - \frac{x}{\sqrt d} < \frac{1}{2x}, \]

and deduce that

\[ \left| \frac{1}{\sqrt d} - \frac{y}{x} \right| < \frac{1}{2x^2}. \]

Then use the simple relation which exists between the convergents of
$\sqrt d$ and those of $1/\sqrt d$, as indicated in Problem 7, Section 5-4.]


7-5 The equation $x^2 - dy^2 = -1$. This equation differs from
$x^2 - dy^2 = 1$ (with $+1$ on the right-hand side) in that it may well have
no solutions at all with $y \ne 0$. For if $x^2 - dy^2 = -1$, then
$x^2 \equiv -1 \pmod d$, so that $-1$ must be a quadratic residue of $d$. This
is certainly not the case, for example, if $d$ is any prime
$p \equiv 3 \pmod 4$, according to Theorem 3-15. On the other hand, if the
equation is solvable, the structure of the set of solutions is similar to that
considered in the preceding section.
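The congruence obstruction just described can be tested directly. The following sketch (the function name `minus_one_is_residue` is an illustrative choice) searches exhaustively for an $x$ with $x^2 \equiv -1 \pmod d$.

```python
def minus_one_is_residue(d):
    """True if x^2 ≡ -1 (mod d) has a solution, by exhaustive search."""
    return any((x * x + 1) % d == 0 for x in range(d))


# Primes p ≡ 3 (mod 4): the congruence fails, so x^2 - p*y^2 = -1 is unsolvable.
assert not any(minus_one_is_residue(p) for p in (3, 7, 11, 19, 23, 31))

# Primes p ≡ 1 (mod 4): the congruence holds, e.g. 2^2 ≡ -1 (mod 5).
assert minus_one_is_residue(5) and minus_one_is_residue(13)
```

The condition is necessary but not sufficient. A standard example is $d = 34$: here $13^2 \equiv -1 \pmod{34}$, yet $x^2 - 34y^2 = -1$ has no solution.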


THEOREM 7-8. Let $d$ be a positive nonsquare integer. Then if the equation

\[ z^2 - dt^2 = -1 \tag{7} \]

is solvable, and if $\gamma = z_1 + t_1\sqrt d$ is the minimal positive
solution, a general solution is given by

\[ z + t\sqrt d = \pm\gamma^{2n+1}, \qquad n = 0, \pm 1, \pm 2, \ldots. \]

With the earlier notation, $\delta = \gamma^2$.


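Proof: We prove the second assertion first. It is clear that $\gamma^2$ is a
solution of (1), since $N(\gamma^2) = (N\gamma)^2 = 1$, and hence
$1 < \delta \le \gamma^2$, by the definition of $\delta$. Since
$1/\gamma = -\bar\gamma$, we have

\[ \frac{1}{\gamma} < -\delta\bar\gamma \le \gamma, \tag{8} \]

and $N(-\delta\bar\gamma) = N\delta N\gamma = -1$. Thus
$-\delta\bar\gamma = \beta$ is a solution of (7), and in particular
$\beta \ne 1$. Taking reciprocals in (8) yields
$1/\gamma \le 1/\beta < \gamma$, and hence either

\[ 1 < \beta \le \gamma \qquad\text{or}\qquad 1 < \frac{1}{\beta} < \gamma. \]

Since $1/\beta = -\bar\beta$ is again a solution of (7), the minimality of
$\gamma$ rules out the second alternative and forces $\beta = \gamma$ in the
first. Hence $\delta = \gamma^2$.

Now suppose that $\beta$ is any solution of (7); we can again restrict
attention to the case $\beta > 1$. Then as in the proof of Theorem 7-7, there
is a nonnegative integer $n$ such that

\[ 1 < \beta\delta^{-n} < \delta = \gamma^2, \]

and dividing through by $\gamma$, we obtain

\[ \gamma^{-1} < \alpha < \gamma, \]

where $\alpha = \beta\delta^{-n}\gamma^{-1}$ is a solution of (1). Since
$1 < \gamma < \delta$, the last inequality implies that
$\delta^{-1} < \alpha < \delta$, so that $\alpha = 1$ and
$\beta = \delta^n\gamma = \gamma^{2n+1}$. ▲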
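The relation $\delta = \gamma^2$ can be confirmed for a few values of $d$ for which (7) is solvable. This is a sketch under stated assumptions: `minimal_solution` is a hypothetical helper that finds minimal solutions by naive search, and the values $d = 2, 5, 10, 13$ were chosen by hand because their minimal solutions are small.

```python
from math import isqrt


def minimal_solution(d, rhs, y_max=10_000):
    """Smallest positive (x, y) with x^2 - d*y^2 = rhs, or None if y_max is exceeded."""
    for y in range(1, y_max + 1):
        x2 = d * y * y + rhs
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
    return None


def mul(a, b, d):
    """Multiply two elements x + y*sqrt(d) given as integer pairs."""
    return (a[0] * b[0] + d * a[1] * b[1], a[0] * b[1] + a[1] * b[0])


for d in (2, 5, 10, 13):
    gamma = minimal_solution(d, -1)      # minimal solution of z^2 - d*t^2 = -1
    delta = minimal_solution(d, 1)       # fundamental solution of x^2 - d*y^2 = 1
    assert mul(gamma, gamma, d) == delta             # delta = gamma^2
    z, t = mul(gamma, delta, d)                      # gamma^3, an odd power
    assert z * z - d * t * t == -1
```

For $d = 13$, for instance, $\gamma = 18 + 5\sqrt{13}$ and $\gamma^2 = 649 + 180\sqrt{13}$.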


PROBLEMS 


1. Let $\xi_n$ be the "fractional part" of $n\sqrt 2$:

\[ \xi_n = n\sqrt 2 - [n\sqrt 2], \]

and let $t$ be a positive real number independent of $n$. (a) Show that if
$t < 1/(2\sqrt 2)$, then $n\xi_n > t$ for all sufficiently large $n$. (b) Show
that if $t > 1/(2\sqrt 2)$, then for a suitable sequence
$n_1 < n_2 < \cdots$ of positive integers, $n_k\xi_{n_k} < t$.


7-6 Pell's equation and continued fractions. We can now make more precise the
connection established in Section 7-4 between the convergents to $\sqrt d$ and
the solutions of $x^2 - dy^2 = \pm 1$. By Theorem 7-3 and Problem 2 of Section
7-4, we know that all such solutions are to be found among the convergents of
$\sqrt d$. The problem is to determine how far out one must hunt to find the
minimal solutions. With respect to the equation with $+1$ on the right, the
situation is not too bad; we know that a solution exists, and that it must
show up eventually among the quantities $p_k^2 - dq_k^2$. But for the equation
with $-1$ on the right, we do not yet have a criterion for deciding whether
the equation is solvable, and it would be desirable to know that if no
solution turns up among the first $M$ convergents, for suitable $M$ depending
on $d$, then there is no solution. The following theorem clarifies the
situation.


THEOREM 7-9. The sequence $\{p_k^2 - dq_k^2\}$ is eventually periodic, the
periodicity beginning (at the latest) at the index preceding that at which the
sequence of partial quotients becomes periodic. The length of the period of
the first sequence is at most twice that of the second.


Proof: With $\xi = \sqrt d$, equation (11) of Chapter 5 yields

\[ \sqrt d = \frac{p_{k-1}\xi_k + p_{k-2}}{q_{k-1}\xi_k + q_{k-2}}. \tag{9} \]

Solving for $\xi_k$ and rationalizing the denominator, we can write

\[ \xi_k = \frac{\sqrt d + r_k}{s_k}, \]

where $r_k$ and $s_k$ are rational numbers. Substituting this into (9), and
then replacing $k$ by $k + 1$ throughout, we have

\[ \sqrt d = \frac{p_k(\sqrt d + r_{k+1}) + p_{k-1}s_{k+1}}{q_k(\sqrt d + r_{k+1}) + q_{k-1}s_{k+1}}, \]

or

\[ (q_k r_{k+1} + q_{k-1}s_{k+1} - p_k)\sqrt d - (p_{k-1}s_{k+1} + p_k r_{k+1} - q_k d) = 0. \]

The rational and irrational parts must separately be zero, so

\[ q_k r_{k+1} + q_{k-1}s_{k+1} = p_k, \qquad p_k r_{k+1} + p_{k-1}s_{k+1} = q_k d. \]

The determinant of this system is $q_k p_{k-1} - q_{k-1}p_k = (-1)^k$, and
hence

\[ r_{k+1} = (-1)^k(p_k p_{k-1} - q_k q_{k-1}d), \qquad s_{k+1} = (-1)^k(q_k^2 d - p_k^2). \tag{10} \]

Now the numbers $r_k$ and $s_k$ are uniquely determined by $\xi_k$; since
$\{\xi_k\}$ is eventually periodic, the same is true of $\{s_k\}$, and the
eventual periodicity of $\{p_k^2 - dq_k^2\}$ follows from (10). In fact,
$\{s_k\}$ becomes periodic at the same index as $\{\xi_k\}$, so $\{s_{k+1}\}$
becomes periodic at the preceding index. Finally, the length of the period of
$\{(-1)^{k-1}s_k\}$ is clearly at most twice that of $\{s_k\}$. ▲
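The relations (10) can be verified numerically. The sketch below is an illustration, not the author's procedure: it generates $a_k$, $r_k$, $s_k$ by the usual integer recurrence $r_{k+1} = a_k s_k - r_k$, $s_{k+1} = (d - r_{k+1}^2)/s_k$ (an assumption brought in here, starting from $r_0 = 0$, $s_0 = 1$), builds the convergents $p_k/q_k$, and asserts both formulas in (10) at every step. The function names are hypothetical.

```python
from math import isqrt


def complete_quotients(d, count):
    """Return [(a_k, r_k, s_k)] for k < count, where xi_k = (sqrt(d) + r_k) / s_k."""
    a0 = isqrt(d)
    r, s = 0, 1                       # xi_0 = sqrt(d)
    out = []
    for _ in range(count):
        a = (a0 + r) // s             # a_k = floor(xi_k), valid because s > 0
        out.append((a, r, s))
        r = a * s - r                 # r_{k+1}
        s = (d - r * r) // s          # s_{k+1}; the division is exact
    return out


def check_relation_10(d, count):
    """Assert (10) for k < count and return the values p_k^2 - d*q_k^2."""
    data = complete_quotients(d, count + 1)
    p_prev, q_prev = 1, 0             # p_{-1}, q_{-1}
    p, q = data[0][0], 1              # p_0 = a_0, q_0 = 1
    values = []
    for k in range(count):
        a_next, r_next, s_next = data[k + 1]
        sign = (-1) ** k
        assert r_next == sign * (p * p_prev - d * q * q_prev)
        assert s_next == sign * (d * q * q - p * p)
        values.append(p * p - d * q * q)
        p, p_prev = a_next * p + p_prev, p
        q, q_prev = a_next * q + q_prev, q
    return values


print(check_relation_10(7, 7))        # [-3, 2, -3, 1, -3, 2, -3]
```

Working with the integer pair $(r_k, s_k)$ avoids floating-point error in the partial quotients.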

As an example, consider the equations $x^2 - 7y^2 = \pm 1$. Continuing the
computations of Section 5-5, we can add a new row to the table occurring
there:

    k:                  0    1    2    3    4    5    6  ...
    p_k^2 - 7q_k^2:    -3    2   -3    1   -3    2   -3  ...

In this case the sequence $\{p_k^2 - 7q_k^2\}$ is periodic from the beginning,
with period length 4. (The sequence of partial quotients is periodic after
$a_0$, with period length 4.) The minimal positive solution of
$x^2 - 7y^2 = 1$ is $x = 8$, $y = 3$. There is no solution of
$x^2 - 7y^2 = -1$, since none shows up in the first period.

Consider instead $x^2 - 5y^2 = \pm 1$. The continued fraction expansion of
$\sqrt 5$ is $\{2; 4, 4, 4, \ldots\}$, the convergents are
$2/1, 9/4, 38/17, \ldots$, and the sequence $\{p_k^2 - 5q_k^2\}$ is
$\{-1, 1, -1, 1, \ldots\}$. The period length of the latter sequence is twice
that of $\{a_k\}$, and both Pell equations are solvable.

We shall not prove it, but what has happened in these examples is typical:
the continued fraction expansion of $\sqrt d$ is always periodic after $a_0$,
so that $\{p_k^2 - dq_k^2\}$ is always periodic from the beginning, and the
equation $x^2 - dy^2 = -1$ is solvable if and only if the period length of
$\{a_k\}$ is odd.
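The parity criterion just stated can be spot-checked for small $d$. The following sketch is a check, not a proof: `period_length` (a hypothetical name) counts steps until the complete quotient $\xi_1$ recurs, relying on the stated fact that the expansion is periodic after $a_0$, and `has_negative_pell_solution` is a naive bounded search whose bound of 100 on $y$ is an arbitrary assumption.

```python
from math import isqrt


def period_length(d):
    """Period length of the partial quotients of sqrt(d), d a nonsquare."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    r, s = a0, d - a0 * a0            # xi_1 = (sqrt(d) + r) / s
    start = (r, s)
    length = 0
    while True:
        a = (a0 + r) // s
        r = a * s - r
        s = (d - r * r) // s
        length += 1
        if (r, s) == start:           # xi_1 has recurred: one full period
            return length


def has_negative_pell_solution(d, y_max=100):
    """True if x^2 - d*y^2 = -1 has a solution with 1 <= y <= y_max."""
    return any(isqrt(d * y * y - 1) ** 2 == d * y * y - 1
               for y in range(1, y_max + 1))


nonsquares = [d for d in range(2, 31) if isqrt(d) ** 2 != d]
odd_period = [d for d in nonsquares if period_length(d) % 2 == 1]
print(odd_period)                     # [2, 5, 10, 13, 17, 26, 29]
assert all(has_negative_pell_solution(d) == (d in odd_period) for d in nonsquares)
```

For $d = 7$ the period length is 4 and for $d = 5$ it is 1, in agreement with the two examples above.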
PROBLEMS

1. Find the minimum positive solutions of
(a) $x^2 - 94y^2 = 1$, (b) $x^2 - 95y^2 = 1$.

2. Using the results of Problems 1 and 2 of Section 7-4, find all the $N$ with
$|N| < 10$ for which the equation $x^2 - 95y^2 = N$ is solvable.
TABLE OF PRIMES

    2    3    5    7   11   13   17   19   23   29   31   37   41
   43   47   53   59   61   67   71   73   79   83   89   97  101
  103  107  109  113  127  131  137  139  149  151  157  163  167
  173  179  181  191  193  197  199  211  223  227  229  233  239
  241  251  257  263  269  271  277  281  283  293  307  311  313
  317  331  337  347  349  353  359  367  373  379  383  389  397
  401  409  419  421  431  433  439  443  449  457  461  463  467
  479  487  491  499  503  509  521  523  541  547  557  563  569
  571  577  587  593  599  601  607  613  617  619  631  641  643
  647  653  659  661  673  677  683  691  701  709  719  727  733
  739  743  751  757  761  769  773  787  797  809  811  821  823
  827  829  839  853  857  859  863  877  881  883  887  907  911
  919  929  937  941  947  953  967  971  977  983  991  997 1009


PRIMITIVE ROOTS

Smallest primitive root g of p

    p    g        p    g
    3    2       43    3
    5    2       47    5
    7    3       53    2
   11    2       59    2
   13    2       61    2
   17    3       67    2
   19    2       71    7
   23    5       73    5
   29    2       79    3
   31    3       83    2
   37    2       89    3
   41    6       97    5
GREEK ALPHABET

  Α  α  Alpha        Ν  ν     Nu
  Β  β  Beta         Ξ  ξ     Xi
  Γ  γ  Gamma        Ο  ο     Omicron
  Δ  δ  Delta        Π  π     Pi
  Ε  ε  Epsilon      Ρ  ρ     Rho
  Ζ  ζ  Zeta         Σ  σ     Sigma
  Η  η  Eta          Τ  τ     Tau
  Θ  θ  Theta        Υ  υ     Upsilon
  Ι  ι  Iota         Φ  φ, ϕ  Phi
  Κ  κ  Kappa        Χ  χ     Chi
  Λ  λ  Lambda       Ψ  ψ     Psi
  Μ  μ  Mu           Ω  ω     Omega
ANSWERS TO SELECTED PROBLEMS




Section 1-3 

7. $\beta = (1 + \sqrt 5)/2$, $c = 1/\beta^2$    8. $\beta = 3$, $c = 2/3$
Section 1-5 

2. $(20473488)_{12}$

Section 2-2 

6. (a) 35; $x = -83$, $y = 32$ (b) 1; $x = -1013$, $y = 534$
8. (a) 17 (c) 19

Section 2-4

1. $x = -3 + 7t$, $y = 5 - 8t$


2. There are five solutions, for which x = 11, 31, 51, 71, 91. 
5. A general solution is 


x= 45 — 267r-+ 166s, 
y= 8r-+ As, 
z = —87 + 21Ir — 159s, 

where r and s are arbitrary integers. 

7. 32.65 

Section 3-5 


1. (a) $x \equiv 7$, 42, or 77 (mod 105) (c) $x \equiv 2, 9, 16, \ldots, 100$ (mod 105)
2. The solutions (mod 18) are:

   1,2    3,3    5,4    7,5    9,0    11,1    13,2    15,3    17,4
   1,8    3,9    5,10   7,11   9,6    11,7    13,8    15,9    17,10
   1,14   3,15   5,16   7,17   9,12   11,13   13,14   15,15   17,16

3. $x \equiv 173$ (mod 210)
5 (a). The system $x_1 \equiv \{1, 0, 0\}$ (mod $\{27, 25, 8\}$) is equivalent
to $x_1 \equiv \{1, 0\}$ (mod $\{27, 200\}$), or setting $x_1 = 200y_1$, we
find that it is equivalent to $200y_1 \equiv 1$ (mod 27). This has the
solution $y_1 \equiv 5$ (mod 27), whence $x_1 \equiv 1000$ (mod 5400).

Similarly, we write

   $x_2 \equiv \{0, 1, 0\}$ (mod $\{27, 25, 8\}$),
   $x_2 \equiv \{1, 0\}$ (mod $\{25, 216\}$),
   $x_2 = 216y_2$,
   $y_2 \equiv 11$ (mod 25),
   $x_2 \equiv 2376$ (mod 5400),

and in the same way we obtain $x_3 \equiv 2025$ (mod 5400). For this system,
this yields $x \equiv 10x_1 + 2x_2 + 3x_3 \equiv 4627$ (mod 5400).
6. (a) 529 (b) 397 (c) 100
Section 3-7
2. $-1$, 1, $-1$
5. (a) No solution (b) $x \equiv 10$ or 12 (mod 18)
Section 4-1

6. (a) 3 (b) 380 (c) 4

Section 4-2 

6. 2, 3, 8, 12, 13, 17, 22, 23 

7. 5, 7, 10, 11, 14, 15, 17, 19, 20, 21 

Section 4-3 

1. (a) $x \equiv 10$ (mod 29) (c) $x \equiv 2, 27$ (mod 29)
(e) $x \equiv 9, 24$ (mod 29) (h) $x \equiv 8, 10, 12, 15, 18, 26, 27$ (mod 29)
2. (a) No (b) Yes


Section 5-1 
1. (a) {2; 8, 5, 2} (c) {53}
2. (a) $\pi = \{3; 7, 15, 1, \xi_4\}$, where

\[ \xi_4 = \frac{106\pi - 333}{355 - 113\pi}. \]

The approximation $\pi \approx 3.14159265$ gives $\xi_4 \approx 285.4$,
whereas a better approximation to $\pi$ shows that $[\xi_4] = 292$.
(b) $\{1; 1, 2, 1, 2/(\sqrt{73} - 1)\}$

7. 2, 24 

Section 5-2 

1. (a) 302 (c) $, and more generally un41/un, where the uz are the Fibonacci 
numbers. 

Section 5-3 

1. (a) $x = 276 + 2971t$, $y = 293 + 3154t$

Section 6-3

1. (a) $-1 + 5i$ (c) 1 [Note that if $(\alpha, \beta) = \delta$, then $N\delta \mid (N\alpha, N\beta)$.]

2. $a \equiv b$ (mod 2)

3. (a) $-1 + 5i = 1\cdot(33 + 17i) - (2 + i)(16 - 2i)$

Section 6-4

2. (a) The first-quadrant integers $\alpha$ with $N\alpha \le 9$ are $i$, $2i$,
$3i$, $1 + i$, $1 + 2i$, $2 + i$, $2 + 2i$. Of these, $2i$, $1 + i$, and
$2 + 2i$ are multiples of $1 + i$. The remaining numbers must be primes or
units, so the Gaussian primes with norms $\le 9$ are exactly the associates of
$1 + i$, $3i$, $1 + 2i$, and $2 + i$.

3. (a) Prime (b) $(1 + i)(2 - 5i)$

Section 6-5 

1. The Gaussian primes with norms $< 50$ are the associates of $1 + i$, 3, 7,
$1 \pm 2i$, $2 \pm 3i$, $4 \pm i$, $5 \pm 2i$, $6 \pm i$, $5 \pm 4i$.

2. (a) (Ll — 2i)(—1+ 47) = (b) 1+ (6 — 4) 

Section 7-6 
1. (a) $x = 2{,}143{,}295$, $y = 221{,}064$ (b) $x = 39$, $y = 4$
2. $N = 1, 5$